Abstract
We develop and discuss a parameterization of vector autoregressive moving average processes with arbitrary unit roots and (co)integration orders. The detailed analysis of the topological properties of the parameterization—based on the state space canonical form of Bauer and Wagner (2012)—is an essential input for establishing statistical and numerical properties of pseudo maximum likelihood estimators as well as, e.g., pseudo likelihood ratio tests based on them. The general results are exemplified in detail for the empirically most relevant cases, the (multiple frequency or seasonal) I(1) and the I(2) case. For these two cases we also discuss the modeling of deterministic components in detail.
1. Introduction
Since the seminal contribution of Clive W.J. Granger (1981) that introduced the concept of cointegration, the modeling of multivariate (economic) time series with models and methods that allow for unit roots and cointegration has become standard econometric practice with applications ranging from macroeconomics to finance to climate science.
The most prominent (parametric) model class for cointegration analysis is that of vector autoregressive (VAR) models, popularized by the important contributions of Søren Johansen and Katarina Juselius and their co-authors, see, e.g., the monographs Johansen (1995) and Juselius (2006). The popularity of VAR cointegration analysis stems not only from the (relative) simplicity of the model class, but also from the fact that the VAR cointegration literature is very well-developed and provides a large battery of tools for diagnostic testing, impulse response analysis, forecast error variance decompositions and the like. All this makes VAR cointegration analysis to a certain extent the benchmark in the literature.1
The imposition of specific cointegration properties on an estimated VAR model becomes increasingly complicated as one moves away from the I(1) case. As discussed in Section 2, e.g., in the I(2) case a triple of indices needs to be chosen (fixed or determined via testing) to describe the cointegration properties. The imposition of cointegration properties in the estimation algorithm then leads to “switching” type algorithms that come together with non-trivial parameterization restrictions involving non-linear inter-relations, compare Paruolo (1996) or Paruolo (2000).2 Mathematically, these complications arise from the fact that the unit root and cointegration properties are in the VAR setting related to rank restrictions on the autoregressive polynomial matrix and its derivatives.
Restricting cointegration analysis to VAR processes may be too restrictive. First, it is well-known since Zellner and Palm (1974) that VAR processes are not invariant with respect to marginalization, i.e., subsets of the variables of a VAR process are in general vector autoregressive moving average (VARMA) processes. Second, similar to the first argument, aggregation of VAR processes also leads to VARMA processes, an issue relevant, e.g., in the context of temporal aggregation and in mixed-frequency settings. Third, the linearized solutions to dynamic stochastic general equilibrium (DSGE) models are typically VARMA rather than VAR processes, see, e.g., Campbell (1994). Fourth, a VARMA model may be a more parsimonious description of the data generating process (DGP) than a VAR model, with parsimony becoming more important with increasing dimension of the process.3
If one accepts the above arguments as a motivation for considering VARMA processes in cointegration analysis, it is convenient to move to the—essentially equivalent (see Hannan and Deistler 1988, chps. 1 and 2)—state space framework. A key challenge when moving from VAR to VARMA models—or state space models—is that identification becomes an important issue for the latter model class, whereas unrestricted VAR models are (reduced-form) identified. In other words, there are so-called equivalence classes of VARMA models that lead to the same dynamic behavior of the observed process. As is well-known, to achieve identification, restrictions have to be placed on the coefficient matrices in the VARMA case, e.g., zero or exclusion restrictions. A mapping attaching to every transfer function, i.e., the function relating the error sequence to the observed process, a unique VARMA (or state space) system from the corresponding class of observationally equivalent systems is called canonical form. Since not all entries of the coefficient matrices in canonical form are free parameters, for statistical analysis a so-called parameterization is required that maps the free parameters from coefficient matrices in canonical form into a parameter vector. These issues, including the importance of properties such as continuity and differentiability of parameterizations, are discussed in detail in Hannan and Deistler (1988, chp. 2) and, of course, are also relevant for our setting in this paper.
The convenience of the state space framework for unit root and cointegration analysis stems from the fact that (static and dynamic) cointegration can be characterized by orthogonality constraints, see Bauer and Wagner (2012), once an appropriate basis for the state vector, which is a (potentially singular) VAR process of order one, is chosen. The integration properties are governed by the eigenvalue structure of unit modulus eigenvalues of the system matrix in the state equation. Eigenvalues of unit modulus and orthogonality constraints arguably are easier restrictions to deal with or to implement than the interrelated rank restrictions considered in the VAR or VARMA setting. The canonical form of Bauer and Wagner (2012) is designed for cointegration analysis by using a basis of the state vector that puts the unit root and cointegration properties to the center and forefront. Consequently, these results are key input for the present paper and are thus briefly reviewed in Section 3.
An important problem with respect to appropriately defining the “free parameters” in VARMA models is the fact that no continuous parameterization of all VARMA or state space models of a certain order n exists in the multivariate case (see Hazewinkel and Kalman 1976). This implies that the model set, $M$ say, has to be partitioned into subsets $M_\gamma$ on which continuous parameterizations exist, i.e., $M = \bigcup_{\gamma \in G} M_\gamma$ for some multi-index $\gamma$ varying in an index set G. Based on the canonical form of Bauer and Wagner (2012), the partitioning is according to systems—in addition to other restrictions such as fixed order n—with fixed unit root properties, to be precise over systems with given state space unit root structure. This has the advantage that, e.g., pseudo maximum likelihood (PML) estimation can straightforwardly be performed over systems with fixed unit root properties without any further ado, i.e., without having to consider (or ignore) rank restrictions on polynomial matrices. The definition and detailed discussion of the properties of this parameterization is the first main result of the paper.
The second main set of results, provided in Section 4, is a detailed discussion of the relationships between the different subsets of models for different indices and the parameterization of the respective model sets. Knowledge concerning these relations is important to understand the asymptotic behavior of PML estimators and pseudo likelihood ratio tests based on them. In particular, the structure of the closure $\overline{M}$, say, of the considered model set M has to be understood, since the difference $\overline{M} \setminus M$ cannot be avoided when maximizing the pseudo likelihood function4. Additionally, the inclusion properties between the different sets need to be understood, as this knowledge is important for developing hypothesis tests, in particular for developing hypothesis tests for the dimensions of cointegrating spaces. Hypothesis testing, with a focus on the MFI(1) and I(2) cases, is discussed in Section 5, which shows how the parameterization results of the paper can be used to formulate a large number of hypotheses on (static and polynomial) cointegrating relationships as considered in the VAR cointegration literature. This discussion also includes commonly used deterministic components such as intercept, seasonal dummies, and linear trend, as well as restrictions on these components.
The paper is organized as follows: Section 2 briefly reviews VAR and VARMA models with unit roots and cointegration and discusses some of the complications arising in the VARMA case in addition to the complications arising due to the presence of unit roots and cointegration already in the VAR case. Section 3 presents the canonical form and the parameterization based on it, with the discussion starting with the multiple frequency I(1)—MFI(1)—and I(2) cases prior to a discussion of the general case. This section also provides several important definitions like, e.g., of the state space unit root structure. Section 4 contains a detailed discussion concerning the topological structure of the model sets and Section 5 discusses testing of a large number of hypotheses on the cointegrating spaces commonly tested in the cointegration literature. The discussion in Section 5 focuses on the empirically most relevant MFI(1) and I(2) cases and includes the usual deterministic components considered in the literature. Section 6 briefly summarizes and concludes the paper. All proofs are relegated to the Appendix A and Appendix B.
Throughout we use the following notation: L denotes the lag operator, i.e., $L(y_t)_{t \in \mathbb{Z}} = (y_{t-1})_{t \in \mathbb{Z}}$, for brevity written as $Ly_t = y_{t-1}$. For a matrix $A \in \mathbb{C}^{m \times n}$, $A^*$ denotes its conjugate transpose. For $A \in \mathbb{C}^{m \times n}$ with full column rank $n < m$, we define $A_\perp \in \mathbb{C}^{m \times (m-n)}$ of full column rank such that $A^* A_\perp = 0$. $I_p$ denotes the p-dimensional identity matrix, $0_{m \times n}$ the m times n zero matrix. For two matrices A and B, $A \otimes B$ denotes the Kronecker product of A and B. For a complex valued quantity x, $\Re(x)$ denotes its real part, $\Im(x)$ its imaginary part and $\bar{x}$ its complex conjugate. For a set V, $\overline{V}$ denotes its closure.5 For two sets V and W, $V \setminus W$ denotes the difference of V and W, i.e., $V \setminus W = \{x \in V : x \notin W\}$. For a square matrix A we denote the spectral radius (i.e., the maximum of the moduli of its eigenvalues) by $\lambda_{max}(A)$ and by $\det(A)$ its determinant.
2. Vector Autoregressive, Vector Autoregressive Moving Average Processes and Parameterizations
In this paper, we define VAR processes $(y_t)_{t \in \mathbb{Z}}$, $y_t \in \mathbb{R}^s$, as solutions of
$$y_t = a_1 y_{t-1} + \dots + a_p y_{t-p} + \Phi d_t + \varepsilon_t, \qquad (1)$$
with $a(L) := I_s - a_1 L - \dots - a_p L^p$, where $a_j \in \mathbb{R}^{s \times s}$ for $j = 1, \dots, p$, $a_p \neq 0$, a white noise process $(\varepsilon_t)_{t \in \mathbb{Z}}$, $\varepsilon_t \in \mathbb{R}^s$, with $\mathbb{E}\varepsilon_t = 0$ and $\mathbb{E}\varepsilon_t\varepsilon_t' = \Sigma > 0$, and a vector sequence $(d_t)_{t \in \mathbb{Z}}$, $d_t \in \mathbb{R}^m$, comprising deterministic components like, e.g., the intercept, seasonal dummies or a linear trend. Furthermore, we impose the non-explosiveness condition $\det a(z) \neq 0$ for all $|z| < 1$, with $a(z) := I_s - a_1 z - \dots - a_p z^p$ and z denoting a complex variable.6
Thus, for given autoregressive order p, with—as defining characteristic of the order—$a_p \neq 0$, the considered class of VAR models with specified deterministic components is given by the set of all polynomial matrices $a(z)$ such that (i) the non-explosiveness condition holds, (ii) $a(0) = I_s$ and (iii) $a_p \neq 0$; together with the set of all matrices $\Phi \in \mathbb{R}^{s \times m}$.
Equivalently, the model class can be characterized by a set of rational matrix functions $k(z) := a(z)^{-1}$, referred to as transfer functions, and the input-output description for the deterministic variables, i.e.,
$$y_t = a(L)^{-1} \Phi d_t + a(L)^{-1} \varepsilon_t.$$
The associated parameter space is $\Theta \subset \mathbb{R}^{s^2 p + sm}$, where the parameters
$$\theta := \mathrm{vec}([a_1, \dots, a_p]), \qquad \theta_\Phi := \mathrm{vec}(\Phi) \qquad (2)$$
are obtained from stacking the entries of the matrices $a_1, \dots, a_p$ and Φ, respectively.
Remark 1.
In the above discussion the parameters, $\theta_\Sigma$ say, describing the variance covariance matrix Σ of $\varepsilon_t$ are not considered. These can easily be included, similarly to Φ, by, e.g., parameterizing positive definite symmetric matrices via their lower triangular Cholesky factor. This leads to a parameter space $\Theta \times \Theta_\Sigma$. We omit $\theta_\Sigma$ for brevity, since typically no cross-parameter restrictions involving parameters corresponding to Σ are considered, whereas, as discussed in Section 5, parameter restrictions involving—in this paper in the state space rather than the VAR setting—both elements of θ and Φ, to, e.g., impose the absence of a linear trend in the cointegrating space, are commonly considered in the cointegration literature.7 The estimator of the variance covariance matrix Σ often equals the sample variance of suitable residuals from (1) if there are no cross-restrictions between θ and $\theta_\Sigma$. This holds, e.g., for the Gaussian pseudo maximum likelihood estimator. Thus, explicitly including $\theta_\Sigma$ and $\Theta_\Sigma$ in the discussion would only overload notation without adding any additional insights, given the simple nature of the parameterization of Σ.
Remark 2.
Our consideration of deterministic components is a special case of including exogenous variables. We include exogenous deterministic variables with a static input-output behavior governed solely by the matrix Φ. More general exogenous variables that are dynamically related to the output could be considered, thereby considering so-called VARX models rather than VAR models, which would necessitate considering, in addition to the transfer function $k(z)$, also a transfer function $l(z)$, say, linking the exogenous variables dynamically to the output.
For the VAR case, the fact that the mapping assigning to a given transfer function $k(z) = a(z)^{-1}$, together with Φ, a parameter vector $(\theta', \theta_\Phi')'$—the parameterization—is continuous with continuously differentiable inverse is immediate.8 Homeomorphicity of a parameterization is important for the properties of parameter estimators, e.g., the ordinary least squares (OLS) or Gaussian PML estimator, compare the discussion in Hannan and Deistler (1988, Theorem 2.5.3 and Remark 1, p. 65).
For OLS estimation one typically considers the larger parameter set obtained without imposing the non-explosiveness condition and without the assumption $a_p \neq 0$:
Considering this larger set allows for unconstrained optimization. It is well-known that for a DGP as given above, the OLS estimator is consistent over this larger set, i.e., without imposing non-explosiveness and also when specifying p too high. Alternatively, and closely related to OLS in the VAR case, the pseudo likelihood can be maximized over the restricted set. With this approach, maxima respectively suprema can occur at the boundary of the parameter space, i.e., maximization effectively has to be performed over its closure. It is well-known that the PML estimator is consistent in the stable case (cf. Hannan and Deistler 1988, Theorem 4.2.1), but the maximization problem is complicated by the restrictions on the parameter space stemming from the non-explosiveness condition. Avoiding these complications, together with the asymptotic equivalence of OLS and PML in the stable VAR case, explains why VAR models are usually estimated by OLS.9
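To make the unconstrained nature of this estimation step concrete, the following minimal sketch estimates a VAR(p) with intercept by least squares in Python/NumPy; the function name and interface are our own choices and not taken from the paper, and no non-explosiveness or $a_p \neq 0$ restrictions are imposed:

```python
import numpy as np

def ols_var(y, p):
    """Unrestricted OLS for a VAR(p) with intercept.

    y: (T, s) data array. Returns the list [a_1, ..., a_p], the
    intercept and the residuals. No stability constraints are imposed,
    mirroring the unconstrained optimization discussed above.
    """
    T, s = y.shape
    # Regressors: y_{t-1}, ..., y_{t-p} and a constant, for t = p, ..., T-1.
    Z = np.hstack([y[p - j:T - j] for j in range(1, p + 1)]
                  + [np.ones((T - p, 1))])
    coef, *_ = np.linalg.lstsq(Z, y[p:], rcond=None)
    A = [coef[(j - 1) * s:j * s].T for j in range(1, p + 1)]
    resid = y[p:] - Z @ coef
    return A, coef[-1], resid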
To be more explicit, ignore deterministic components for a moment and consider the case where the DGP is a stationary VAR process, i.e., a solution of (1) with $a(z)$ satisfying the stability condition $\det a(z) \neq 0$ for $|z| \leq 1$. Define the corresponding set of stable transfer functions:
Clearly, the set of stable transfer functions is an open subset of the set of all considered transfer functions. If the DGP is a stationary VAR process, the above-mentioned consistency result of the OLS estimator implies that the probability that the estimated transfer function, $\hat{k}(z)$ say, is contained in this set converges to one as the sample size tends to infinity. Moreover, the asymptotic distribution of the estimated parameters is normal, under appropriate assumptions on $(\varepsilon_t)_{t \in \mathbb{Z}}$.
The situation is a bit more involved if the transfer function of the DGP corresponds to a point in the complement of the stable set within its closure, which contains systems with unit roots, i.e., determinantal roots of $a(z)$ on the unit circle, as well as lower order autoregressive systems—with these two cases non-disjoint. The stable lower order case is relatively unproblematic from a statistical perspective. If, e.g., OLS estimation is performed for order p, while the true model corresponds to an element of the set with lower order $\tilde{p} < p$, the OLS estimator is still consistent, since this set is contained in the set corresponding to order p. Furthermore, standard chi-squared pseudo likelihood ratio test based inference still applies. The integrated case, for a precise definition see the discussion below Definition 1, is a bit more difficult to deal with, as in this case not all parameters are asymptotically normally distributed and nuisance parameters may be present. Consequently, parameterizations that do not take the specific nature of unit root processes into account are not very useful for inference in the unit root case, see, e.g., Sims et al. (1990, Theorem 1). Studying the unit root and cointegration properties is facilitated by resorting to suitable parameterizations that “zoom in on the relevant characteristics”.
In case the only determinantal root of $a(z)$ on the unit circle is located at $z = 1$, the system corresponds to a so-called I(d) process, with the integration order d made precise in Definition 1 below. Consider first the I(1) case: As is well-known, the rank of the matrix $\Pi := -a(1)$ equals the dimension of the cointegrating space given in Definition 3 below—also referred to as the cointegrating rank. Therefore, determination of the rank of this matrix is of key importance. With the parameterization used so far, imposing a certain (maximal) rank on Π implies complicated restrictions on the matrices $a_1, \dots, a_p$. This in turn renders the correspondingly restricted optimization unnecessarily complicated and not conducive to developing tests for the cointegrating rank. It is more convenient to consider the so-called vector error correction model (VECM) representation of autoregressive processes, discussed in full detail in the monograph Johansen (1995). To this end let us first introduce the differencing operator at frequency $\omega \in [0, \pi]$:
$$\Delta_\omega(L) := \begin{cases} 1 - e^{i\omega} L, & \omega \in \{0, \pi\}, \\ (1 - e^{i\omega} L)(1 - e^{-i\omega} L), & \omega \in (0, \pi). \end{cases} \qquad (3)$$
For notational brevity, we omit the dependence on L in $\Delta_\omega(L)$, henceforth denoted as $\Delta_\omega$. Using this notation, the I(1) error correction representation is given by
$$\Delta_0 y_t = \alpha\beta' y_{t-1} + \sum_{j=1}^{p-1} \Gamma_j \Delta_0 y_{t-j} + \Phi d_t + \varepsilon_t, \qquad (4)$$
with the matrix $\Pi := \alpha\beta' \in \mathbb{R}^{s \times s}$ of rank r factorized into the product of two full rank matrices $\alpha \in \mathbb{R}^{s \times r}$ and $\beta \in \mathbb{R}^{s \times r}$, $0 \leq r \leq s$.
This constitutes a reparameterization, where the transfer function is now represented by the matrices $\alpha, \beta, \Gamma_1, \dots, \Gamma_{p-1}$ and a corresponding parameter vector. Please note that simply stacking the entries of these matrices does not lead to a homeomorphic mapping to the set of transfer functions, since for $0 < r < s$ the matrices α and β are not identifiable from the product $\Pi = \alpha\beta'$, since $\alpha\beta' = (\alpha T)(\beta (T')^{-1})'$ for all regular matrices $T \in \mathbb{R}^{r \times r}$. One way to obtain identifiability is to introduce the restriction $\beta' = [I_r, \beta_*']$, with $\beta_* \in \mathbb{R}^{(s-r) \times r}$. With this additional restriction the parameter vector is given by stacking the vectorized matrices $\alpha, \beta_*, \Gamma_1, \dots, \Gamma_{p-1}$ and Φ, similarly to (2). Note for completeness that the normalization of β may necessitate a re-ordering of the variables in $y_t$ since—without potential reordering—this parameterization implies a restriction of generality as, e.g., processes, where the first variable is integrated, but does not cointegrate with the other variables, cannot be represented.
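The identification problem and its resolution can be made concrete numerically. The following sketch (our own helper, assuming the upper r × r block of an arbitrary rank factorization of Π is regular, which may require the reordering of variables mentioned above) maps a reduced rank matrix Π to the normalized factors:

```python
import numpy as np

def normalized_factors(Pi, r):
    """Factor Pi (rank r) as alpha @ beta.T with beta' = [I_r, beta_*'].

    Sketch only: the normalization assumes the upper r x r block of
    the initial beta factor is invertible.
    """
    U, sv, Vt = np.linalg.svd(Pi)
    alpha, beta = U[:, :r] * sv[:r], Vt[:r].T   # Pi = alpha @ beta.T
    T_mat = beta[:r, :r]                        # upper block; must be regular
    beta_n = beta @ np.linalg.inv(T_mat)        # upper block becomes I_r
    alpha_n = alpha @ T_mat.T                   # product remains unchanged
    return alpha_n, beta_n
```

Since $\alpha\beta' = (\alpha T)(\beta (T')^{-1})'$ for any regular T, only the product is identified from the data; the normalization selects one representative of the equivalence class, exactly as in the discussion above.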
Define the following sets of transfer functions:
The dimension of the parameter vector depends on the dimension of the cointegrating space, thus the parameterization depends on r. The so-called reduced rank regression (RRR) estimator, given by the maximizer of the pseudo likelihood for given rank r, is consistent, see, e.g., Johansen (1995, chp. 6). The RRR estimator uses an “implicit” normalization of β and thereby implicitly addresses the mentioned identification problem. However, for testing hypotheses involving the free parameters in α or β, typically the identifying assumption given above is used, as discussed in Johansen (1995, chp. 7).
Furthermore, since the set of transfer functions corresponding to cointegrating rank r is a lower dimensional subset of the closure of the set corresponding to rank $r + 1$, pseudo likelihood ratio testing can be used to sequentially test for the rank r, starting with the hypothesis of rank $r = 0$ against the alternative of rank $r > 0$, and increasing the assumed rank consecutively until the null hypothesis is not rejected.
Ensuring that $y_t$ generated from (4) is indeed an I(1) process requires on the one hand that Π is of reduced rank, i.e., $r < s$, and on the other hand that the matrix
$$\alpha_\perp' \Big( I_s - \sum_{j=1}^{p-1} \Gamma_j \Big) \beta_\perp \qquad (5)$$
has full rank. It is well-known that condition (5) is fulfilled on the complement of a “thin” algebraic subset of the parameter space and is, therefore, ignored in estimation, as it is “generically” fulfilled.10
The I(2) case is similar in structure to the I(1) case, but with two rank restrictions and one full rank condition to exclude even higher integration orders. The corresponding VECM is given by
$$\Delta_0^2 y_t = \alpha\beta' y_{t-1} - \Gamma \Delta_0 y_{t-1} + \sum_{j=1}^{p-2} \Psi_j \Delta_0^2 y_{t-j} + \Phi d_t + \varepsilon_t, \qquad (6)$$
with $\alpha\beta'$ as defined in (4), $\Gamma := I_s - \sum_{j=1}^{p-1} \Gamma_j$ the middle factor in (5) and $\Psi_j \in \mathbb{R}^{s \times s}$, $j = 1, \dots, p-2$. From (5) we already know that reduced rank of
$$\alpha_\perp' \Gamma \beta_\perp = \xi \eta',$$
with ξ and η of full column rank, is required for higher integration orders. The condition for the corresponding solution process to be an I(2) process is given by full rank of a further matrix, given in (7), which again is typically ignored in estimation, just like condition (5) in the I(1) case. Thus, I(2) processes correspond to a “thin subset” of the set of systems satisfying the reduced rank conditions, which in turn constitutes a “thin subset” of the set of all considered VAR systems. The fact that integrated processes correspond to “thin sets” implies that obtaining estimated systems with specific integration and cointegration properties requires restricted estimation based on parameterizations tailor-made to highlight these properties.
Already for the I(2) case, formulating parameterizations that allow conveniently studying the integration and cointegration properties is a quite challenging task. Johansen (1997) contains several different (re-)parameterizations for the I(2) case and Paruolo (1996) defines “integration indices”, $(r, s_1, s_2)$ say, as the numbers of columns of the matrices β, $\beta_1$ and $\beta_2$. Clearly, the indices are linked to the ranks of the above matrices, as $r = \mathrm{rk}(\alpha\beta')$ and $s_1 = \mathrm{rk}(\alpha_\perp'\Gamma\beta_\perp)$, and the columns of $[\beta, \beta_1, \beta_2]$ form a basis of $\mathbb{R}^s$, such that $s = r + s_1 + s_2$. It holds that $\beta_2' y_t$ is an I(2) process without cointegration and $\beta_1' y_t$ is an I(1) process without cointegration. The process $\beta' y_t$ is typically I(1) and in this case cointegrates with $\Delta_0 y_t$ to stationarity. Thus, there is a direct correspondence of these indices to the dimensions of the different cointegrating spaces—both static and dynamic (with precise definitions given below in Definition 3).11 Please note that again, as already before in the I(1) case, different values of the integration indices $(r, s_1, s_2)$ lead to parameter spaces of different dimensions. Furthermore, in these parameterizations matrices describing different cointegrating spaces are (i) not identified and (ii) linked by restrictions, compare the discussion in Paruolo (2000, sct. 2.2) and (7). These facts render the analysis of the cointegration properties in I(2) VAR systems complicated. Also, in the I(2) VAR case usually some forms of RRR estimators are considered over suitable subsets of the parameter space, again based on implicit normalizations. Inference, however, again requires one to consider parameterizations explicitly.
Estimation and inference issues are fundamentally more complex in the VARMA case than in the VAR case. This stems from the fact that unrestricted estimation—unlike in the VAR case—is not possible due to a lack of identification, as discussed below. This means that in the VARMA case identification and parameterization issues need to be tackled as the first step, compare the discussion in Hannan and Deistler (1988, chp. 2).
In this paper, we consider VARMA processes as solutions of the vector difference equation
$$a(L) y_t = b(L) \varepsilon_t + \Phi d_t,$$
with $a(L) = I_s - a_1 L - \dots - a_p L^p$, where $a_j \in \mathbb{R}^{s \times s}$ for $j = 1, \dots, p$, $a_p \neq 0$, and the non-explosiveness condition $\det a(z) \neq 0$ for $|z| < 1$. Similarly, $b(L) = I_s + b_1 L + \dots + b_q L^q$, where $b_j \in \mathbb{R}^{s \times s}$ for $j = 1, \dots, q$, $b_q \neq 0$, and $\det b(z) \neq 0$ for $|z| < 1$. The transfer function corresponding to a VARMA process is $k(z) = a(z)^{-1} b(z)$.
It is well-known that without further restrictions the VARMA realization of the transfer function is not identified, i.e., different pairs of polynomial matrices $(a(z), b(z))$ can realize the same transfer function $k(z) = a(z)^{-1} b(z)$. It is clear that $(m(z)a(z))^{-1}(m(z)b(z)) = a(z)^{-1}b(z)$ for all non-singular polynomial matrices $m(z)$. Thus, the mapping attaching the transfer function $k(z)$ to the pair $(a(z), b(z))$ is not injective.12
Consequently, we refer for given rational transfer function $k(z)$ to the class of all pairs $(a(z), b(z))$ with $a(z)^{-1}b(z) = k(z)$ as a class of observationally equivalent VARMA realizations of $k(z)$. To achieve identification requires to define a canonical form, selecting one member of each class of observationally equivalent VARMA realizations for a set of considered transfer functions. A first step towards a canonical form is to only consider left coprime pairs $(a(z), b(z))$.13 However, left coprimeness is not sufficient for identification and thus further restrictions are required, leading to parameter vectors of smaller dimension. A widely used canonical form is the (reverse) echelon canonical form, see Hannan and Deistler (1988, Theorem 2.5.1, p. 59), based on (monic) normalizations of the diagonal elements of $a(z)$ and degree relationships between diagonal and off-diagonal elements as well as the entries in $b(z)$, which lead to zero restrictions. The (reverse) echelon canonical form in conjunction with a transformation to an error correction model was used in VARMA cointegration analysis in the I(1) case, e.g., in Poskitt (2006, Theorem 4.1), but, as for the VAR case, understanding the interdependencies of rank conditions already becomes complicated once one moves to the I(2) case.
In the VARMA case matters are further complicated by another well-known problem that makes statistical analysis considerably more involved compared to the VAR case. Although there exists a generalization of the autoregressive order to the VARMA case, such that any transfer function corresponding to a VARMA system has an order n (with the precise definition given in the next section), it is known since Hazewinkel and Kalman (1976) that no continuous parameterization of all rational transfer functions of order n exists if $s > 1$. Therefore, if one wants to keep the above-discussed advantages that continuity of a parameterization provides, the set of transfer functions of order n, henceforth referred to as $M_n$, has to be partitioned into sets on which continuous parameterizations exist, i.e., $M_n = \bigcup_{\gamma \in G} M_\gamma$, for some index set G, as already mentioned in the introduction.14 For any given partitioning it is important to understand the relationships between the different subsets $M_\gamma$, as well as the closures $\overline{M_\gamma}$ of the pieces, since in case of misspecification points in $\overline{M_\gamma} \setminus M_\gamma$ cannot be avoided even asymptotically in, e.g., pseudo maximum likelihood estimation. These are more complicated issues in the VARMA case than in the VAR case, see the discussion in Hannan and Deistler (1988, Remark 1 after Theorem 2.5.3).
Based on these considerations, the following section provides and discusses a parameterization that focuses on unit root and cointegration properties, resorting to the state space framework that—as mentioned in the introduction—provides advantages for cointegration analysis. In particular, we derive an almost everywhere homeomorphic parameterization, based on partitioning the set of all considered transfer functions according to a multi-index that contains, among other elements, the state space unit root structure. This implies that certain cointegration properties are invariant for all systems corresponding to a given subset, i.e., the parameterization allows one to directly impose cointegration properties such as the “integration indices” of Paruolo (1996) mentioned before.
3. The Canonical Form and the Parameterization
As a first step we define the class of VARMA processes considered in this paper, using the differencing operator defined in (3):
Definition 1.
The s-dimensional real VARMA process $(y_t)_{t \in \mathbb{Z}}$ has unit root structure $\Omega := ((\omega_1, h_1), \dots, (\omega_l, h_l))$ with $0 \leq \omega_1 < \omega_2 < \dots < \omega_l \leq \pi$ and integers $h_k > 0$, if it is a solution of the difference equation
$$\Delta_{\omega_1}^{h_1} \cdots \Delta_{\omega_l}^{h_l} (y_t - \Phi d_t) = v_t, \qquad (8)$$
where $(d_t)_{t \in \mathbb{Z}}$ is an m-dimensional deterministic sequence and $(v_t)_{t \in \mathbb{Z}}$ is a linearly regular stationary VARMA process, i.e., there exists a pair of left coprime matrix polynomials $(a(z), b(z))$ such that $a(L)v_t = b(L)\varepsilon_t$ for a white noise process $(\varepsilon_t)_{t \in \mathbb{Z}}$ with $\mathbb{E}\varepsilon_t\varepsilon_t' = \Sigma > 0$, with furthermore $\det a(z) \neq 0$ for $|z| \leq 1$ and $\det b(z) \neq 0$ for $|z| < 1$.
- The process is called a unit root process with unit roots $e^{i\omega_k}$, $k = 1, \dots, l$; the set $\{\omega_1, \dots, \omega_l\}$ is the set of unit root frequencies and the integers $h_1, \dots, h_l$ are the integration orders.
- A unit root process with unit root structure $\Omega = ((0, d))$, $d \in \mathbb{N}$, is an I(d) process.
- A unit root process with unit root structure $\Omega = ((\omega_1, 1), \dots, (\omega_l, 1))$ is an MFI(1) process.
A linearly regular stationary VARMA process has empty unit root structure.
As discussed in Bauer and Wagner (2012) the state space framework is convenient for the analysis of VARMA unit root processes. Detailed treatments of the state space framework are given in Hannan and Deistler (1988) and—in the context of unit root processes—Bauer and Wagner (2012).
A state space representation of a unit root VARMA process is15
$$x_{t+1} = A x_t + B \varepsilon_t, \qquad y_t = C x_t + \Phi d_t + \varepsilon_t, \qquad (9)$$
for a white noise process $(\varepsilon_t)_{t \in \mathbb{Z}}$, $\varepsilon_t \in \mathbb{R}^s$, a deterministic process $(d_t)_{t \in \mathbb{Z}}$, $d_t \in \mathbb{R}^m$, and the unobserved state process $(x_t)_{t \in \mathbb{Z}}$, $x_t \in \mathbb{C}^n$, with system matrices $A \in \mathbb{C}^{n \times n}$, $B \in \mathbb{C}^{n \times s}$, $C \in \mathbb{C}^{s \times n}$ and $\Phi \in \mathbb{R}^{s \times m}$.
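For concreteness, the following minimal simulation sketch generates data from a toy system of the form (9) without deterministic components; the system matrices are hypothetical choices (one unit root eigenvalue and one stable eigenvalue) and are not in canonical form:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 500
A = np.array([[1.0, 0.0],    # eigenvalue 1: unit root state
              [0.0, 0.5]])   # eigenvalue 0.5: stable state
B = np.array([[1.0, 0.3],
              [0.2, 1.0]])
C = np.array([[1.0, 0.5],
              [2.0, -0.4]])
eps = rng.standard_normal((T, 2))
x = np.zeros(2)
y = np.empty((T, 2))
for t in range(T):
    y[t] = C @ x + eps[t]    # y_t = C x_t + eps_t  (Phi d_t omitted)
    x = A @ x + B @ eps[t]   # x_{t+1} = A x_t + B eps_t
```

Both components of $y_t$ load on the single I(1) state through the first column of C, so they are cointegrated; this is picked up again in the sketch after Example 1 below.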
Remark 3.
Bauer and Wagner (2012, Theorem 2) show that every real valued unit root VARMA process as given in (8) has a real valued state space representation with real valued state and real valued system matrices $(A, B, C, \Phi)$. Considering complex valued state space representations in (9) is merely for algebraic convenience, as in general some eigenvalues of A are complex valued. Note for completeness that Bauer and Wagner (2012) contains a detailed discussion of why considering the A-matrix in the canonical form in (up to reordering) Jordan normal form is useful for cointegration analysis. For the sake of brevity we abstain from including this discussion again in the present paper. The key aspect of this construction is its usefulness for cointegration analysis, which becomes visible in Remark 4, where the “simple” unit root properties of blocks of the state vector are discussed.
The transfer function with real valued power series coefficients corresponding to a real valued unit root process as given in Definition 1 is a rational matrix function $k(z)$. The (possibly complex valued) matrix triple $(A, B, C)$ realizes the transfer function $k(z)$ if and only if $k(z) = I_s + z C (I_n - zA)^{-1} B$. Please note that, as for VARMA realizations, for a transfer function there exist multiple state space realizations $(A, B, C)$, with possibly different state dimensions n. A state space system is minimal if there exists no state space system of lower state dimension realizing the same transfer function. The order of the transfer function is the state dimension of a minimal system realizing it.
All minimal state space realizations of a transfer function only differ in the basis of the state (cf. Hannan and Deistler 1988, Theorem 2.3.4), i.e., for two minimal state space systems $(A_1, B_1, C_1)$ and $(A_2, B_2, C_2)$, realizing the same transfer function is equivalent to the existence of a regular matrix T such that $A_2 = T A_1 T^{-1}$, $B_2 = T B_1$ and $C_2 = C_1 T^{-1}$. Thus, the matrices $A_1$ and $A_2$ are similar for all minimal realizations of a transfer function.
By imposing restrictions on the matrices of a minimal state space system realizing $k(z)$, Bauer and Wagner (2012, Theorem 2) provide a canonical form, i.e., a mapping attaching to every transfer function in the set of transfer functions with real valued power series coefficients defined below a unique state space realization $(A, B, C)$. The set is defined as
To describe the necessary restrictions of the canonical form the following definition is useful:
Definition 2.
A matrix $B \in \mathbb{C}^{c \times s}$, $c \leq s$, is positive upper triangular (p.u.t.) if there exist integers $1 \leq j_1 < j_2 < \dots < j_c \leq s$ such that for $i = 1, \dots, c$ we have $B_{i, j_i} \in \mathbb{R}$, $B_{i, j_i} > 0$ and $B_{i, j} = 0$ for $j < j_i$; i.e., B is of the form
where the symbol * indicates unrestricted complex-valued entries.
A unique state space realization of the transfer function is given as follows (cf. Bauer and Wagner 2012, Theorem 2):
Theorem 1.
For every transfer function $k(z)$ in the set defined above there exists a unique minimal (complex) state space realization $(A, B, C)$ such that
with:
- (i)
- , where it holds for that
- -
- for :
- -
- for :
withwhere . - (ii)
- and are partitioned accordingly. It holds for that
- -
- for :
- -
- for :
- (iii)
- Partitioning in as , with it holds that is p.u.t. for for and .
- (iv)
- For define , with and for , with . Furthermore, define . It holds that and for for and .
- (v)
- and the stable subsystem of state dimension is in echelon canonical form (cf. Hannan and Deistler 1988, Theorem 2.5.2).
Remark 4.
As indicated in Remark 3 and discussed in detail in Bauer and Wagner (2012), considering complex valued quantities is merely for algebraic convenience. For econometric analysis, interest is, of course, in real valued quantities. These can be straightforwardly obtained from the representation given in Theorem 1 as follows. First define a transformation matrix (and its inverse):
Starting from the complex valued canonical representation , a real valued canonical representation
with real valued matrices follows from using the just defined transformation matrix. In particular it holds that:
with
Before we turn to the real valued state process corresponding to the real valued canonical representation, we first consider the complex valued state process in more detail. This process is partitioned according to the partitioning of the matrices into , where
with
For the sub-vectors are further decomposed into , with for according to the partitioning .
The partitioning of the complex valued process leads to an analogous partitioning of the real valued state process , , obtained from
with the corresponding block of the state equation given by
For the sub-vectors are further decomposed into , with if and if for and decomposed accordingly.
Bauer and Wagner (2012, Theorem 3, p. 1328) show that the processes have unit root structure for and . Furthermore, for and the processes are not cointegrated, as defined in Definition 3 below. For , the process is the -dimensional process of stochastic trends of order , while the components of , for , and the components of , for , are referred to as stochastic cycles of order at their corresponding frequencies .
Remark 5.
Parameterizing the stable part of the transfer function using the echelon canonical form is merely one possible choice. Any other canonical form of the stable subsystem and suitable parameterization based on it can be used instead for the stable subsystem.
Remark 6.
Starting from a state space system (9) with matrices in canonical form, a solution for $y_t$ (with the solution for $x_t$ obtained completely analogously)—for some starting value $x_1$—is given by
$$y_t = C A^{t-1} x_1 + \sum_{j=1}^{t-1} C A^{j-1} B \varepsilon_{t-j} + \varepsilon_t + \Phi d_t.$$
Clearly, the term $C A^{t-1} x_1$ is stochastically singular and effectively acts like a deterministic component, which may lead to an identification problem with $\Phi d_t$. If the deterministic component is rich enough to “absorb” the contribution of the unit root part of the state, then one solution of the identification problem is to set the starting value of the unit root part of the state to zero. Rich enough here means, e.g., that in the I(1) case $d_t$ contains an intercept. Analogously, in the MFI(1) case $d_t$ has to contain seasonal dummy variables corresponding to all unit root frequencies. The contribution of the stable part of the state decays exponentially and, therefore, does not impact the asymptotic properties of any statistical procedure. It is, therefore, inconsequential for statistical analysis but convenient (with respect to our definition of unit root processes) to choose the starting value of the stable part of the state as the steady state or stationary solution of the stable block of the state equation, which renders the stable part of the state—or, when the solution on $t \in \mathbb{Z}$ is considered, the stable subsystem—stationary. Please note that these issues with respect to starting values, potential identification problems and their impact or non-impact on statistical procedures also occur in the VAR setting.
Bauer and Wagner (2012, Theorem 2) show that minimality of the canonical state space realization implies full row rank of the p.u.t. blocks of B. In addition to proposing the canonical form, Bauer and Wagner (2012) also provide details on how to transform any minimal state space realization into canonical form: Given a minimal state space system realizing the transfer function, the first step is to find a similarity transformation T such that $TAT^{-1}$ is of the form given in (10) by using an eigenvalue decomposition, compare Chatelin (1993). In the second step the corresponding stable subsystem is transformed to echelon canonical form as described in Hannan and Deistler (1988, chp. 2). These two transformations do not lead to a unique realization, because the restrictions on A do not uniquely determine the unstable subsystem.
For example, in the case of unit root structure $((0, 1))$, where the unstable part of the state transition matrix equals $I_c$, such that $(I_c, B_1, C_1)$ is a corresponding state space system, the same transfer function is realized also by all systems $(I_c, T B_1, C_1 T^{-1})$, with T some regular matrix. To find a unique realization the product $C_1 B_1$ needs to be uniquely decomposed into factors $C_1$ and $B_1$. This is achieved by performing a QR decomposition of $C_1$ (without pivoting) that leads to a factor $C_1$ with orthonormal columns, i.e., $C_1^* C_1 = I_c$. The additional restriction of $B_1$ being a p.u.t. matrix of full row rank then leads to a unique factorization of the product into $C_1$ and $B_1$. In the general case with an arbitrary unit root structure, similar arguments lead to p.u.t. restrictions on sub-blocks of B and orthogonality restrictions on sub-blocks of C.
The canonical form introduced in Theorem 1 was designed to be useful for cointegration analysis. To see this, first requires a definition of static and polynomial cointegration (cf. Bauer and Wagner 2012, Definitions 3 and 4).
Definition 3.
- (i)
- Let and be two unit root structures. Then if
- -
- .
- -
- For all for and k such that it holds that .
Furthermore, if and . For two unit root structures define the decrease of the integration order at frequency , for , as - (ii)
- An s-dimensional unit root process with unit root structure Ω is cointegrated of order , where , if there exists a vector , such that has unit root structure . In this case the vector β is a cointegrating vector (CIV) of order .
- (iii)
- All CIVs of a given order span the cointegrating space of that order.
- (iv)
- An s-dimensional unit root process with unit root structure Ω is polynomially cointegrated of order , where , if there exists a vector polynomial , for some integer such that
- -
- has unit root structure ,
- -
- .
In this case the vector polynomial is a polynomial cointegrating vector (PCIV) of order . - (v)
- All PCIVs of order span the polynomial cointegrating space of order .
Remark 7.
- (i)
- It is merely a matter of taste whether cointegrating spaces are defined in terms of their order or their decrease, as defined above. Specifying Ω and the decrease contains the same information as providing the order of (polynomial) cointegration.
- (ii)
- Notwithstanding the fact that CIVs and PCIVs in general may lead to changes of the integration orders at different unit root frequencies, it may be of interest to “zoom in” on only one unit root frequency, thereby leaving the potential reductions of the integration orders at the other unit root frequencies unspecified. This allows one to—entirely similarly as in Definition 3—define cointegrating and polynomial cointegrating spaces of different orders at a single unit root frequency. Analogously one can also define cointegrating and polynomial cointegrating spaces of different orders for subsets of the unit root frequencies.
- (iii)
- In principle the polynomial cointegrating spaces defined so far are infinite-dimensional as the polynomial degree is not bounded. However, since every polynomial vector can be written as , where by definition has empty unit root structure, it suffices to consider PCIVs of polynomial degree smaller than the polynomial degree of . This shows that it is sufficient to consider finite dimensional polynomial cointegrating spaces. When considering, as in item (ii), (polynomial) cointegration only for one unit root it similarly suffices to consider polynomials of maximal degree equal to for real unit roots and for complex unit roots. Thus, in the I(2) case it suffices to consider polynomials of degree one.
- (iv)
- The argument about maximal relevant polynomial degrees given in item (iii) can be made more precise and combined with the decrease in Ω achieved. Every polynomial vector can be written as for . By definition it holds that has integration order at frequency . Thus, it suffices to consider PCIVs of polynomial degree smaller than for or for when considering the polynomial cointegrating space at with decrease . In the MFI(1) case therefore, when considering only one unit root frequency, again only polynomials of degree one need to be considered. This space is often referred to in the literature as dynamic cointegration space.
To illustrate the advantages of the canonical form for cointegration analysis consider
By Remark 4, the process is not cointegrated. This implies that reduces the integration order at unit root to if and only if and or equivalently and (using the transformation to the complex matrices of the canonical form, as discussed in Remark 4, and that if and only if ). Thus, the CIVs are characterized by orthogonality to sub-blocks of .
The real valued representation given in Remark 4, used in its partitioned form just above, immediately leads to a necessary orthogonality constraint for polynomial cointegration of degree one:
follows. Since all terms except the first are stationary or deterministic, a necessary condition for a reduction of the unit root structure is the orthogonality of to sub-blocks of or sub-blocks of the complex matrix . Please note, however, that this orthogonality condition is not sufficient for to be a PCIV, because it does not imply . For a detailed discussion of polynomial cointegration, when considering also higher polynomial degrees, see Bauer and Wagner (2012, sct. 5).
The following examples illustrate cointegration analysis in the state space framework for the empirically most relevant, i.e., the I(1), MFI(1) and I(2) cases.
Example 1 (Cointegration in the I(1) case).
In the I(1) case, neglecting the stable subsystem and the deterministic components for simplicity, it holds that
The vector , is a CIV of order if and only if .
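In terms of the toy simulation given after (9), the CIVs in the I(1) case can be computed directly as the left kernel of the C-block loading the unit root state, e.g., via an SVD; a sketch, reusing the variables defined there:

```python
# Continuation of the simulation sketch after (9): C[:, :1] loads the
# single I(1) state, so CIVs are the vectors orthogonal to this column.
C1 = C[:, :1]
U = np.linalg.svd(C1, full_matrices=True)[0]
beta = U[:, 1:]            # orthonormal basis of the left kernel of C1
print(beta.T @ C1)         # ~0 by construction
z = y @ beta               # beta' y_t: the stationary linear combination
```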
Example 2 (Cointegration in the MFI(1) case with complex unit root ).
In the MFI(1) case with unit root structure and complex unit root , neglecting the stable subsystem and the deterministic components for simplicity, it holds that
The vector is a CIV of order if and only if
The vector polynomial , with is a PCIV of order if and only if
which is equivalent to
The fact that the matrix in (11) has a block structure with two blocks of conjugate complex columns implies some additional structure also on the space of PCIVs, here with polynomial degree one. More specifically it holds that if is a PCIV of order , also is a PCIV of order . This follows from
Thus, the space of PCIVs of degree (up to) one inherits some additional structure emanating from the occurrence of complex eigenvalues in complex conjugate pairs.
Example 3 (Cointegration in the I(2) case).
In the I(2) case, neglecting the stable subsystem and the deterministic components for simplicity, it holds that
The vector is a CIV of order if and only if
The vector is a CIV of order if and only if
The vector polynomial , with is a PCIV of order if and only if
The above orthogonality constraint indicates that the two cases and have to be considered separately for polynomial cointegration analysis. Consider first the case . In this case the orthogonality constraints imply , and . Thus, the vector is a CIV of order and therefore is of “non-minimum” degree, one in this case rather than zero (). For a formal definition of minimum degree PCIVs see Bauer and Wagner (2003, Definition 4). In case there are PCIVs of degree one that are not simple transformations of static CIVs. Consider such that is stationary. The integrated contribution to is given by , with . This term is eliminated by in , if , which is only possible if . Additionally, needs to hold, such that there is no further integrated contribution to . Neither nor are CIVs since both violate the necessary conditions given in the definition of CIVs, which implies that is indeed a “minimum degree” PCIV.
As was shown above, the unit root and cointegration properties of the process depend on the sub-blocks of C and the eigenvalue structure of A. We, therefore, define the more encompassing state space unit root structure containing information on the geometrical and algebraic multiplicities of the eigenvalues of A (cf. Bauer and Wagner 2012, Definition 2).
Definition 4.
A unit root process with a canonical state space representation as given in Theorem 1 has state space unit root structure
where for . For with empty unit root structure .
Remark 8.
The state space unit root structure contains information concerning the integration properties of the process, since the integers appearing in it describe (multiplied by two for k such that $\omega_k \in (0, \pi)$) the numbers of non-cointegrated stochastic trends or cycles of the corresponding integration orders, compare again Remark 4. As such, the state space unit root structure describes properties of the stochastic process—and, therefore, partitions unit root processes according to these (co-)integration properties. These (co-)integration properties, however, are invariant to the chosen canonical representation, or more generally invariant to whether a VARMA or state space representation is considered. For all minimal state space representations of a unit root process these indices—being related to the Jordan normal form—are invariant.
As mentioned in Section 2, Paruolo (1996, Definition 3) introduces integration indices at frequency zero as a triple of integers $(r, s_1, s_2)$. These correspond to the numbers of columns of the matrices β, $\beta_1$ and $\beta_2$ in the error correction representation of I(2) VAR processes, see, e.g., Johansen (1997, sct. 3). Here, $s_2$ is the number of stochastic trends of order two. Furthermore, $s_1$ is the number of stochastic trends of order one that do not cointegrate. Therefore, the integration indices at frequency zero are in one-to-one correspondence with the state space unit root structure for I(2) processes and the dimension of the process.
The canonical form given in Theorem 1 imposes p.u.t. structures on sub-blocks of the matrix B. The occurrence of these blocks—related to the unit root part of the state—is determined by the state space unit root structure. The number of free entries in these p.u.t. blocks, however, is not determined by the state space unit root structure alone. Consequently, we need structure indices indicating for each row the position of the potentially restricted positive element, as formalized below:
Definition 5 (Structure indices).
For the block of the matrix of a state space realization in canonical form, define the corresponding structure indices as
Remark 9.
Since sub-blocks of corresponding to complex unit roots are of the form , the entries restricted to be positive are located in the same columns and rows of both and . Thus, the structure indices of the corresponding rows are identical for and . Therefore, it would be possible to omit the parts of p corresponding to the blocks . It is, however, as will be seen in Definition 9, advantageous for the comparison of unit root structures and structure indices that p is a vector with entries.
Example 4.
Consider the following state space system:
In canonical form and are p.u.t. matrices and is unrestricted. If, e.g., the second entry of and the first entry of are restricted to be positive, then
where the symbol * denotes unrestricted entries. In this case .
For given state space unit root structure the matrix A is fully determined. The parameterization of the set of feasible matrices B for given structure indices p and of the set of stable subsystems for given Kronecker indices (cf. Hannan and Deistler 1988, chp. 2) is straightforward, since the entries in these matrices are either unrestricted, restricted to zero or restricted to be positive. Matters are a bit more complicated for C. One possibility to parameterize the set of possible matrices C for a given state space unit root structure is to use real and complex valued Givens rotations (cf. Golub and van Loan 1996, chp. 5.1).
Definition 6 (Real Givens rotation).
The real Givens rotation $G_{j,k}(\theta) \in \mathbb{R}^{n \times n}$, $1 \leq j < k \leq n$, $\theta \in [0, 2\pi)$, is defined as the identity matrix $I_n$ with the entries at the intersections of rows and columns j and k replaced according to
$$\begin{bmatrix} G_{j,j} & G_{j,k} \\ G_{k,j} & G_{k,k} \end{bmatrix} := \begin{bmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{bmatrix}.$$
Remark 10.
Givens rotations allow transforming any vector $x \in \mathbb{R}^n$ into a vector of the form $(r, 0, \dots, 0)'$ with $r = \|x\| \geq 0$. This is achieved by the following algorithm:
- 1.
- Set , and .
- 2.
- Represent using polar coordinates as , with and . If , set (cf. Otto 2011, chp. 1.5.3, p. 39). Then such that , with .
- 3.
- If , stop. Else increment j by one () and continue at step 2.
This algorithm determines a unique vector of rotation angles for every vector $x \in \mathbb{R}^n$.
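A minimal implementation sketch of this algorithm in Python; the ordering of the rotations and the handling of zero pairs follow one common convention and may differ in detail from the paper:

```python
import numpy as np

def real_givens_angles(x):
    """Reduce x in R^n to (r, 0, ..., 0)', r = ||x||, by successive 2x2
    rotations on adjacent pairs, returning r and the rotation angles."""
    x = np.asarray(x, dtype=float).copy()
    thetas = []
    for j in range(len(x) - 1, 0, -1):
        a, b = x[j - 1], x[j]
        r = np.hypot(a, b)
        theta = np.arctan2(b, a) if r > 0 else 0.0  # zero pair: angle set to 0
        thetas.append(theta)
        x[j - 1], x[j] = r, 0.0
    return x[0], thetas
```

Applying the 2 x 2 rotation [[cos θ, sin θ], [-sin θ, cos θ]] with θ = atan2(b, a) to a pair (a, b)' yields (√(a² + b²), 0)', so sweeping from the bottom up leaves the norm of x in the first entry.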
Remark 11.
The determinant of real Givens rotations is equal to one, i.e., $\det G_{j,k}(\theta) = 1$ for all $1 \leq j < k \leq n$ and all $\theta \in [0, 2\pi)$. Thus, it is not possible to factorize an orthonormal matrix Q with $\det(Q) = -1$ into a product of Givens rotations. This obvious fact has implications for the parameterization of C-matrices as is detailed below.
Definition 7 (Complex Givens rotation).
The complex Givens rotation , , is defined as
Remark 12.
Complex Givens rotations allow transforming any vector $x \in \mathbb{C}^n$ into a vector of the form $(r, 0, \dots, 0)'$ with $r = \|x\| \geq 0$ real. This is achieved by the following algorithm:
- 1.
- Set , and .
- 2.
- Represent using polar coordinates as , with and . If , set and if , set (cf. Otto 2011, chp. 8.1.3, p. 222).
- 3.
- SetThen such that , with .
- 4.
- If , stop. Else increment j by one () and continue at step 2.
This algorithm determines a unique vector of rotation angles for every vector $x \in \mathbb{C}^n$.
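An analogous sketch for the complex case; the polar-coordinate conventions for the two angles per rotation are again one common choice and may differ in detail from the paper:

```python
import numpy as np

def complex_givens_angles(x):
    """Reduce x in C^n to (r, 0, ..., 0)' with r = ||x|| >= 0 real,
    returning r, the (theta, phi) pairs and a final phase psi."""
    x = np.asarray(x, dtype=complex).copy()
    angles = []
    for j in range(len(x) - 1, 0, -1):
        a, b = x[j - 1], x[j]
        theta = np.arctan2(abs(b), abs(a))
        phi = (np.angle(b) - np.angle(a)) if abs(a) * abs(b) > 0 else 0.0
        c, s = np.cos(theta), np.sin(theta)
        # unitary block [[c, s e^{-i phi}], [-s e^{i phi}, c]] zeros x[j]
        x[j - 1], x[j] = c * a + s * np.exp(-1j * phi) * b, 0.0
        angles.append((theta, phi))
    psi = np.angle(x[0])       # final phase making the leading entry real
    return abs(x[0]), angles, psi
```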
To set the stage for the general case, we start the discussion of the parameterization of the set of C matrices in canonical form with the MFI(1) and I(2) cases. These two cases display all ingredients required later for the general case. The MFI(1) case illustrates the usage of either real or complex Givens rotations, depending on whether the considered block of C corresponds to a real or complex unit root. The I(2) case highlights recursive orthogonality constraints on the parameters of the C block, which are related to the polynomial cointegration properties (cf. Example 3).
3.1. The Parameterization in the MFI(1) Case
The state space unit root structure of an MFI(1) process is given by $((\omega_1, c_1), \dots, (\omega_l, c_l))$, say. For the corresponding state space system in canonical form, the sub-blocks of A are equal to $e^{i\omega_k} I_{c_k}$, the sub-blocks $B_k$ of B are p.u.t. and $C_k^* C_k = I_{c_k}$, for $k = 1, \dots, l$.
Starting with the sub-blocks $C_k$ of C, it is convenient to separate the discussion of the parameterization of C-blocks into the real case, where $\omega_k \in \{0, \pi\}$, and the complex case with $\omega_k \in (0, \pi)$. For the case of real unit roots the two cases $c_k < s$ and $c_k = s$ have to be distinguished. For brevity of notation refer to the considered real block simply as $C \in \mathbb{R}^{s \times c}$. Using this notation, the set of matrices to be parameterized is the set of all real s × c matrices with orthonormal columns, i.e., with $C'C = I_c$.
The parameterization of this set is based on the combination of real Givens rotations, as given in Definition 6, that allow transforming every matrix in the set to the form $[I_c, 0_{c \times (s-c)}]'$ for $c < s$. For $c = s$, Givens rotations allow transforming every matrix either to $I_s$ or to $\mathrm{diag}(1, \dots, 1, -1)$, since, compare Remark 11, for the transformed matrix Q it holds that $\det(Q) \in \{1, -1\}$. This is achieved with the following algorithm:
- 1. Set $j = 1$.
- 2. Transform the entries in the j-th row of the current matrix to zero, except for the j-th entry, which becomes a non-negative real number. Since this concerns a row vector, this is achieved by right-multiplication with transposed Givens rotations and the required parameters are obtained via the algorithm described in Remark 10. The first $j - 1$ entries of the j-th row remain unchanged.
- 3. If $j = c$ stop. Else increment j by one ($j \to j + 1$) and continue at step 2.
- 4. Collect all parameters used for the Givens rotations in steps 1 to 3 in a first parameter vector. Steps 1–3 correspond to a QR decomposition, with an orthonormal matrix Q given by the product of the Givens rotations. Please note that the first $j - 1$ entries of the j-th column of the transformed matrix are equal to zero by construction.
- 5. Set $j = 1$.
- 6. Collect the entries in column j which have not been transformed to zero by previous transformations into a vector. Using the algorithm described in Remark 10 transform this vector to $(1, 0, \dots, 0)'$ by left-multiplication with Givens rotations. Since Givens rotations are orthonormal, the transformed matrix is still orthonormal, so the leading entry equals one and the remaining entries in the corresponding row and column equal zero. An exception occurs if the relevant entries are all equal to zero; in this case the corresponding parameters are set to zero and no Givens rotations are defined.
- 7. If $j = c$ stop. Else increment j by one ($j \to j + 1$) and continue at step 6.
- 8. Collect all parameters used for the Givens rotations in steps 5 to 7 in a second parameter vector.
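The forward direction of this construction can be sketched compactly: starting from the two groups of angles, build a matrix with orthonormal columns as a product of Givens rotations applied to the first c columns of the identity matrix. The arrangement of the rotations below is our own and need not coincide with the canonical ordering; it does, however, exhibit the split into angles that move the column space and angles that only rotate the basis within it:

```python
import numpy as np

def givens(n, i, j, theta):
    """Identity with a 2x2 rotation by theta embedded in coordinates (i, j)."""
    G = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    G[i, i] = G[j, j] = c
    G[i, j], G[j, i] = s, -s
    return G

def C_from_angles(s, c, th_span, th_basis):
    """Forward map: angles -> s x c matrix with orthonormal columns.

    th_span  (c*(s-c) angles): rotations mixing span and co-span
             coordinates; they alone determine the column space.
    th_basis (c*(c-1)/2 angles): rotations inside the span; they only
             change the orthonormal basis, cf. Remark 14 below.
    """
    Q = np.eye(s)
    k = 0
    for i in range(c):
        for j in range(c, s):
            Q = Q @ givens(s, i, j, th_span[k])
            k += 1
    k = 0
    for i in range(c):
        for j in range(i + 1, c):
            Q = Q @ givens(s, i, j, th_basis[k])
            k += 1
    return Q[:, :c]    # orthonormal columns by construction
```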
The parameter vector θ, combining these two parts, contains the angles of the employed Givens rotations and provides one way of parameterizing the set of s × c matrices with orthonormal columns. The following Lemma 1 demonstrates the usefulness of this parameterization.
Lemma 1 (Properties of the parameterization of ).
Define for a mapping from by
with , where and . The following properties hold:
- (i)
- is closed and bounded.
- (ii)
- The mapping is infinitely often differentiable.
For , it holds that
- (iii)
- For every there exists a vector such thatThe algorithm discussed above defines the inverse mapping .
- (iv)
- The inverse mapping —the parameterization of —is infinitely often differentiable on the pre-image of the interior of . This is an open and dense subset of .
For , it holds that
- (v)
- is a disconnected space in with two disjoint non-empty closed subsets and .
- (vi)
- For every there exists a vector such thatIn this case, steps 1-4 of the algorithm discussed above define the inverse mapping .
- (vii)
- Define . Then a parameterization of is given byThe parameterization is infinitely often differentiable with infinitely often differentiable inverse on an open and dense subset of .
Remark 13.
The following arguments illustrate why the parameterization is not continuous on the pre-image of the boundary: Consider the unit sphere in $\mathbb{R}^3$. One way to parameterize the unit sphere is to use degrees of longitude and latitude. Two types of discontinuities occur: After fixing the location of the zero degree of longitude, i.e., the prime meridian, its anti-meridian is described by both 180°W and 180°E. Using the half-open interval $[0, 2\pi)$ in our parameterization causes a similar discontinuity. Second, the degree of longitude is irrelevant at the north pole. As seen in Remark 10, with our parameterization a similar issue occurs when the two entries of C to be compared are both equal to zero. In this case the parameter of the Givens rotation is set to zero, although every θ would produce the same result. Both discontinuities clearly occur on a thin subset of the parameter space.
As in the parameterization of the VAR I(1) case in the VECM framework, where the restriction $\beta' = [I_r, \beta_*']$ can only be imposed when the upper $r \times r$ block of the true β of the DGP is of full rank (cf. Johansen 1995, chp. 5.2), the set where the discontinuities occur can effectively be changed by a permutation of the components of the observed time series. This corresponds to redefining the locations of the prime meridian and the poles.
Remark 14.
Please note that the parameterization partitions the parameter vector θ into two parts, collected in steps 1–4 and in steps 5–8 of the algorithm above, respectively. Since changing the parameter values of the first part does not change the column space of C, which, as seen above, determines the cointegrating vectors, the second part fully characterizes the (static) cointegrating space. Please note that the dimension of this second part is $c(s - c)$ and thus coincides with the number of free parameters in β in the VECM framework (cf. Johansen 1995, chp. 5.2).
Example 5.
Consider the matrix
with and . As discussed, the static cointegrating space is characterized by the left kernel of this matrix. The left kernel of a matrix in with full rank two is given by a one-dimensional space, with the corresponding basis vector parameterized, when normalized to length one, by two free parameters. Thus, for the characterization of the static cointegrating space two parameters are required, which exactly coincides with the dimension of given in Remark 14. The parameters in correspond to the choice of a basis of the image of C. Having fixed the two-dimensional subspace through , only one free parameter for the choice of an orthonormal basis remains, which again coincides with the dimension given in Remark 14. To obtain the parameter vector, the starting point is a QR decomposition of . In this example , with to be determined. To find , solve for and . In other words, find and such that , which leads to , . Thus, the orthonormal matrix is equal to and the transpose of the upper triangular matrix is equal to:
Second, transform the entries in the lower -sub-block of to zero, starting with the last column. For this find such that , i.e., . This yields , . Next compute :
In the final step find such that , i.e., . The solution is , . Combining the transformations leads to
The parameter vector for this matrix is therefore with .
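A quick numerical spot check of these dimension counts (two column-space angles plus one basis angle for s = 3, c = 2), using the hypothetical forward-map sketch C_from_angles given before Lemma 1 with arbitrary angle values:

```python
import numpy as np  # C_from_angles as sketched before Lemma 1

rng = np.random.default_rng(1)
C = C_from_angles(3, 2, th_span=rng.uniform(0, 2 * np.pi, size=2),
                  th_basis=rng.uniform(0, 2 * np.pi, size=1))
print(np.allclose(C.T @ C, np.eye(2)))   # True: orthonormal columns
```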
In case of complex unit roots, referring for brevity again to the considered block simply as $C \in \mathbb{C}^{s \times c}$, the set of matrices to be parameterized is the set of all complex s × c matrices with orthonormal columns, i.e., with $C^* C = I_c$.
The parameterization of this set is based on the combination of complex Givens rotations, as given in Definition 7, which can be used to transform every matrix in this set to the form $(D', 0')'$ with D a diagonal c × c matrix whose diagonal elements are of unit modulus. This transformation is achieved with the following algorithm:
- 1. Set $j = 1$.
- 2. Transform the entries in the j-th row of the current matrix to zero, except for the j-th entry. Since this concerns a row vector, this is achieved by right-multiplication with transposed Givens rotations and the required parameters are obtained via the algorithm described in Remark 12. The first $j - 1$ entries of the j-th row remain unchanged.
- 3. If $j = c$ stop. Else increment j by one ($j \to j + 1$) and continue at step 2.
- 4. Collect all parameters used for the Givens rotations in steps 1 to 3 in a first parameter vector. Steps 1–3 correspond to a QR decomposition, with a unitary matrix Q given by the product of the Givens rotations. Please note that the first $j - 1$ entries of the j-th column of the transformed matrix are equal to zero by construction.
- 5. Set $j = 1$.
- 6. Collect the entries in column j which have not been transformed to zero by previous transformations into a vector. Using the algorithm described in Remark 12 transform this vector such that only its leading entry, of unit modulus, remains non-zero, by left-multiplication with Givens rotations. Since Givens rotations are unitary, the transformed matrix is still unitary, so the remaining entries in the corresponding row and column equal zero. An exception occurs if the relevant entries are all equal to zero; in this case the corresponding parameters are set to zero and no Givens rotations are defined.
- 7. If $j = c$ stop. Else increment j by one ($j \to j + 1$) and continue at step 6.
- 8. Collect all parameters used for the Givens rotations in steps 5 to 7 in a second parameter vector.
- 9. Transform the diagonal entries of the transformed matrix into polar coordinates and collect the angles in a third parameter vector.
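The elementary building block for this algorithm can be sketched as follows; the sign and phase conventions are our own choice of one common variant:

```python
import numpy as np

def complex_givens(n, i, j, theta, phi):
    """Unitary matrix: identity with the 2x2 block
    [[c, s e^{-i phi}], [-s e^{i phi}, c]] in coordinates (i, j)."""
    G = np.eye(n, dtype=complex)
    c, s = np.cos(theta), np.sin(theta)
    G[i, i] = G[j, j] = c
    G[i, j] = s * np.exp(-1j * phi)
    G[j, i] = -s * np.exp(1j * phi)
    return G

# The diagonal phases from step 9 enter as a diagonal factor, e.g.,
# D = np.diag(np.exp(1j * psi)) for a vector psi of angles.
```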
The following lemma demonstrates the usefulness of this parameterization.
Lemma 2 (Properties of the parameterization in the complex case).
Define for a mapping from by
with , where , and and where . The following properties hold:
- (i)
- is closed and bounded.
- (ii)
- The mapping is infinitely often differentiable.
- (iii)
- For every a vector exists such that . The algorithm discussed above defines the inverse mapping .
- (iv)
- The inverse mapping —the parameterization of —is infinitely often differentiable on an open and dense subset of .
Remark 15.
Note the partitioning of the parameter vector φ into the parts , and . The component fully characterizes the column space of , i.e., determines the cointegrating spaces.
Example 6.
Consider the matrix
The starting point is again a QR decomposition of . To find a complex Givens rotation such that with , transform the entries of into polar coordinates. The equation has the solutions and . Using the results of Remark 12, the parameters of the Givens rotation are and . Right-multiplication of C with leads to
Since the entries in the lower -sub-block of are already equal to zero, the remaining complex Givens rotations are . Finally, the parameter values corresponding to the diagonal matrix are and .
The parameter vector for this matrix is therefore , with .
Components of the Parameter Vector
Based on the results of the preceding sections we can now describe the parameter vectors for the general case. The dimensions of the parameter vectors of the respective blocks of the system matrices depend on the multi-index , consisting of the state space unit root structure , the structure indices p and the Kronecker indices for the stable subsystem. A parameterization of the set of all systems in canonical form with given multi-index for the MFI(1) case, therefore, combines the following components:
- , with: for , with denoting the j-th entry of the structure indices p corresponding to . The vectors contain the real and imaginary parts of the free entries in not restricted by the p.u.t. structures.
- : The vectors contain the entries in restricted by the p.u.t. structures to be positive reals.
- : The parameters for the matrices as discussed in Lemma 1 and Lemma 2.
- : The parameters for the stable subsystem in echelon canonical form for Kronecker indices .
Example 7.
Consider an MFI(1) process with , , , and system matrices
in canonical form. For this example it holds that , and
with parameter values corresponding to the C-blocks collected in considered in Examples 5 and 6.
3.2. The Parameterization in the I(2) Case
The canonical form provided above for the general case has the following form for I(2) processes with unit root structure :
where , and are p.u.t., , , , , and is in echelon canonical form with Kronecker indices . All matrices are real valued.
The parameterizations of the p.u.t. matrices and are as discussed above. The entries of are unrestricted and thus included in the parameter vector containing also the free entries in and . The subsystem is parameterized using the echelon canonical form.
The parameterization of proceeds as in the MFI(1) case, using . The parameterization of has to take the restriction of orthogonality of to into account, thus the set to be parameterized is given by
The parameterization of this set again uses real Givens rotations. For it follows that for a matrix such that with corresponding to . The matrix is parameterized as discussed in Lemma 1.
Corollary 1 (Properties of the parameterization of ).
Define for a mapping from by
where denotes the parameter values corresponding to as defined in Lemma 1. The following properties hold:
- (i)
- is closed and bounded.
- (ii)
- The mapping is infinitely often differentiable.
For , it holds
- (iii)
- For every there exists a vector such that . The algorithm discussed above Lemma 1 defines the inverse mapping .
- (iv)
- The inverse mapping —the parameterization of —is infinitely often differentiable on the pre-image of the interior of . This is an open and dense subset of .
For , it holds that
- (v)
- is a disconnected space with two disjoint non-empty closed subsets:
- (vi)
- For every there exists a vector such that . Steps 1–4 of the algorithm discussed above Lemma 1 define the inverse mapping .
- (vii)
- Define . Then a parameterization of is given by . The parameterization is infinitely often differentiable with infinitely often differentiable inverse on an open and dense subset of .
The proof of Corollary 1 uses the same arguments as the proof of Lemma 1 and is, therefore, omitted. It remains to provide a parameterization for restricted to be orthogonal to both and . Thus, the set to be parameterized is given by
The parameterization of is straightforward: Left multiplication of with as defined in Lemma 1 and of the lower - block with as defined in Corollary 1 transforms the upper -block to zero and collects the free parameters in the lower -block. Clearly, this is a bijective and infinitely often differentiable mapping on and thus a useful parameterization, since the matrix is only multiplied with two constant invertible matrices. The entries of the matrix product are then collected in a parameter vector as shown in Corollary 2.
Corollary 2 (Properties of the parameterization of ).
Define for given matrices and a mapping from by
where denotes the parameter values corresponding to as defined in Lemma 1 and denotes the parameter values corresponding to as defined in Corollary 1. The set is closed, and both and —the parameterization of —are infinitely often differentiable.
Components of the Parameter Vector
In the I(2) case, the multi-index contains the state space unit root structure , the structure indices , encoding the p.u.t. structures of and , and the Kronecker indices for the stable subsystem. The parameterization of the set of all systems in canonical form with given multi-index for the I(2) case uses the following components:
- : The vector contains the free entries in not restricted by the p.u.t. structure, collected in the same order as for the matrices in the MFI(1) case.
- : The vector contains the entries in restricted by the p.u.t. structures to be positive reals.
- : The parameters for the matrices as in the MFI(1) case and as discussed in Corollary 1.
- : The parameters for the matrix as discussed in Corollary 2.
- : The parameters for the stable subsystem in echelon canonical form for Kronecker indices .
Example 8.
Consider an I(2) process with , , and system matrices
In this case, , . It follows from
that and .
3.3. The Parameterization in the General Case
Inspecting the canonical form shows that all relevant building blocks are already present in the MFI(1) and the I(2) cases and can be combined to deal with the general case: The entries in are either unrestricted or follow restrictions according to given structure indices p, and the parameter space is chosen accordingly, as discussed for the MFI(1) and I(2) cases. The restrictions on the matrix and its blocks require more sophisticated parameterizations of parts of unitary or orthonormal matrices as well as of orthogonal complements. These are dealt with in Lemmas 1 and 2 and Corollaries 1 and 2 above. The extension of Corollaries 1 and 2 to complex matrices and to matrices which are orthogonal to a larger number of blocks of is straightforward.
The following theorem characterizes the properties of parameterizations for sets of transfer functions with (general) multi-index and describes the relations between sets of transfer functions and the corresponding sets of triples of system matrices in canonical form, defined below. Discussing the continuity and differentiability of mappings on sets of transfer functions and on sets of matrix triples also requires the definition of a topology on both sets.
Definition 8.
- (i)
- The set of transfer functions of order n, , is endowed with the pointwise topology : First, identify transfer functions with their impulse response sequences. Then, a sequence of transfer functions converges in to if and only if for every it holds that .
- (ii)
- The set of all triples in canonical form corresponding to transfer functions with multi-index Γ is called . The set is endowed with the topology corresponding to the distance .
Please note that in the definition of the pointwise topology convergence does not need to be uniform in j; moreover, the power series coefficients do not need to converge to zero for , hence the concept can also be used for unstable systems.
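As a minimal illustration of the pointwise topology, assuming transfer functions are represented by their impulse response coefficients, the following sketch (names ours) checks coefficient-wise convergence up to a truncation lag. The example sequence has coefficients that diverge in j, i.e., each member is unstable, yet it converges pointwise to a unit root limit.

```python
import numpy as np

def converges_pointwise(k_n, k_lim, J, tol=1e-6):
    """Check k_n(j) -> k_lim(j) coefficient by coefficient up to lag J;
    no uniformity over j is required."""
    return all(np.linalg.norm(k_n(j) - k_lim(j)) < tol for j in range(J))

# unstable example: k_n(j) = (1 + 1/n)^j -> k_lim(j) = 1 for every fixed j,
# although the coefficients of each k_n diverge as j grows
n = 10**6
assert converges_pointwise(lambda j: np.atleast_1d((1 + 1/n)**j),
                           lambda j: np.atleast_1d(1.0), J=50, tol=1e-3)
```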
Theorem 2.
The set can be partitioned into pieces , where , i.e.,
where , with for and for is the state dimension of the unstable subsystem with state space unit root structure and is the state dimension of the stable subsystem with Kronecker indices .
For every multi-index Γ there exists a parameter space for some integer , endowed with the Euclidean norm, and a function , such that for every the parameter vector is composed of:
- The parameter vector , collecting the (real and imaginary parts of) non-restricted entries in as described in the MFI(1) case.
- The parameter vector , collecting the entries in restricted by the p.u.t. forms to be positive reals in a similar fashion as described for in the I(2) case.
- The parameter vector , collecting the parameters for all blocks , and , obtained using Givens rotations (see Lemmas 1 and 2 and Corollary 1 and its extension to complex matrices).
- The parameter vector , collecting the parameters (real and imaginary parts for complex roots) for , and , subject to the orthogonality restrictions (see Corollary 2 and its extension to complex matrices).
- The parameter vector collecting the free entries in echelon canonical form with Kronecker indices .
- (i)
- The mapping that attaches a triple in canonical form to a transfer function in is continuous. It is the inverse (restricted to ) of the -continuous function .
- (ii)
- Every parameter vector corresponds to a triple and a transfer function . The mapping is continuous on .
- (iii)
- For every multi-index Γ the set of points in , where the mapping is continuous, is open and dense in .
As mentioned in Section 2, the parameterization of is straightforward. The entries of are collected in a parameter vector . Thus, there is a one-to-one correspondence between state space realizations and parameter vectors . The same holds true for the parameters of the symmetric, positive definite innovation covariance matrix , obtained, e.g., from a lower triangular Cholesky factor of .
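A minimal sketch of the Cholesky-based parameterization of the innovation covariance matrix, collecting the entries of the lower triangular factor directly (variants that log-transform the diagonal are equally possible but not used here; function names are ours):

```python
import numpy as np

def sigma_to_param(Sigma):
    """Free parameters of a symmetric positive definite Sigma: the entries of
    its (unique) lower triangular Cholesky factor with positive diagonal."""
    L = np.linalg.cholesky(Sigma)
    return L[np.tril_indices_from(L)]        # s(s+1)/2 free parameters

def param_to_sigma(theta, s):
    """Inverse map: rebuild L and form Sigma = L L'."""
    L = np.zeros((s, s))
    L[np.tril_indices(s)] = theta
    return L @ L.T

Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
assert np.allclose(param_to_sigma(sigma_to_param(Sigma), 2), Sigma)
```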
4. The Topological Structure
The parameterization of in Theorem 2 partitions into subsets for a selection of multi-indices . To every multi-index there corresponds an associated parameter set . Thus, in practical applications, maximizing the pseudo likelihood requires choosing the multi-index . Maximizing the pseudo likelihood over the set effectively amounts to including also all elements in the closure of , because of the continuity of the parameterization. It is thus necessary to characterize the closures of the sets .
Moreover, maximizing the pseudo likelihood function over all possible multi-indices is time-consuming and not desirable. Fortunately, the results discussed below show that there exists a generic multi-index such that . This generic choice corresponds to the set of all stable systems of order n corresponding to the generic neighborhood of the echelon canonical form. This multi-index, therefore, is a natural starting point for estimation.
However, in particular for hypothesis testing, it will be necessary to maximize the pseudo likelihood over sets of transfer functions of order n with specific state space unit root structure , denoted as below, where denotes the dimension of the stable part of the state. We show below that also in this case there exists a generic multi-index such that .
The main tool to obtain these results is investigating the properties of the mappings , that map transfer functions in to triples , as well as analyzing the closures of the sets . The relation between parameter vectors and triples of system matrices is easier to understand than the relation between and , due to the results of Theorem 2. Consequently, this section focuses on the relations between and —and their closures—for different multi-indices .
To define the closures we embed the sets of matrices in canonical form with multi-indices corresponding to transfer functions of order n into the space of all conformable complex matrix triples with , where additionally . Since the elements of are matrix triples, this set is isomorphic to a subset of the finite dimensional space , equipped with the Euclidean topology. Please note that also contains non-minimal state space realizations, corresponding to transfer functions of lower order.
Remark 16.
In principle the set also contains state space realizations of transfer functions with complex valued coefficients . Since the subset of of state space systems realizing transfer functions with real valued is closed in , realizations corresponding to transfer functions with coefficients with non-zero imaginary part are irrelevant for the analysis of the closures of the sets .
After investigating the closure of in , denoted by , we consider the set of corresponding transfer functions . Since we effectively maximize the pseudo likelihood over , we have to understand for which multi-indices the set is a subset of . Moreover, we find a covering of . This restricts the set of multi-indices that may occur as possible multi-indices of the limit of a sequence in and thus the set of transfer functions that can be obtained by maximization of the pseudo likelihood.
The sets , are embedded into the vector space M of all causal transfer functions . The vector space M is isomorphic to the infinite dimensional space equipped with the pointwise topology. Since, as mentioned above, maximization of the pseudo likelihood function over effectively includes , it is important to determine for any given multi-index , the multi-indices for which the set is a subset of . Please note that is not necessarily equal to . The continuity of , as shown in Theorem 2 (i), implies the following inclusions:
In general all these inclusions are strict. For a discussion in case of stable transfer functions see Hannan and Deistler (1988, Theorem 2.5.3).
We first define a partial ordering on the set of multi-indices . Subsequently we examine the closure in and finally we examine the closures in M.
Definition 9.
- (i)
- For two state space unit root structures and with corresponding matrices and in canonical form, it holds that if and only if there exists a permutation matrix S such that . Moreover, holds if additionally .
- (ii)
- For two state space unit root structures and and dimensions of the stable subsystems we define . Strict inequality holds, if at least one of the two inequalities above holds strictly.
- (iii)
- For two pairs and with corresponding matrices and in canonical form, it holds that if and only if there exists a permutation matrix S such that , where and restricts at least as many entries as , i.e., holds for all . Moreover, holds if additionally .
- (iv)
- Let and . Then if and only if . Moreover, holds, if at least one inequality is strict (compare Hannan and Deistler 1988, sct. 2.5).
Finally, define
Strict inequality holds, if at least one of the inequalities above holds strictly.
Please note that (i) implies that only contains unit roots that are also contained in , with the integration orders of the unit roots in smaller or equal to the integration orders of the respective unit roots in . Thus, denoting the unit root structures corresponding to and by and , it follows that implies . The reverse does not hold as, e.g., for (where hence ) and (with ) it holds that , but neither nor holds as here
This partial ordering is convenient for the characterization of the closure of .
4.1. The Closure of in
Please note that the block-structure of implies that every system in can be separated into two subsystems and . Define as the set of all state space realizations in canonical form corresponding to state space unit root structure , structure indices p and . Analogously define as the set of all state space realizations in canonical form with and Kronecker indices . Examining and separately simplifies the analysis.
4.1.1. The Closure of
The canonical form imposes substantial structure, i.e., restrictions, on the matrices and . By definition and the closures of the three matrices can be analyzed separately. and are easy to investigate. The structure of is fully determined by and consequently consists of a single matrix, which immediately implies that . The matrix , compare Theorem 1, is composed of blocks that are sub-blocks of unitary (or orthonormal) matrices and blocks that have to fulfill (recursive) orthogonality constraints. The corresponding sets were shown to be closed in Lemmas 1 and 2 and Corollaries 1 and 2. Thus, .
It remains to discuss . The structure indices p defining the p.u.t. structures of the matrices restrict some entries to be positive. Combining all the parameters—unrestricted ones, with complex values parameterized by real and imaginary parts, and the positive entries—into a parameter vector leads to an open subset of for some m. For convergent sequences of systems with fixed and p, limits of entries restricted to be positive may be zero. When this happens, two cases have to be distinguished. First, all p.u.t. sub-matrices still have full row rank. In this case the limiting system, say, is still minimal and can be transformed to a system in canonical form with fewer unrestricted entries in .
Second, if at least one of the row ranks of the p.u.t. blocks decreases in the limit, the limiting system is no longer minimal. Consequently, in the limit.
To illustrate this point consider again Example 4 with Equation (12) rewritten as
If and , , it holds that is an I(2) process with state space unit root structure .
Now consider a sequence of systems with all parameters except for constant and . The limiting system is then given by
In the limiting system is redundant and is an I(1) process rather than an I(2) process. Dropping leads to a state space realization of the limiting system given by
In case has full rank, the above system is minimal. Since , the matrix needs to be transformed into p.u.t. format. By definition all systems in the sequence, with , have structure indices as discussed in Example 12. The limiting system—in case of full rank of —has indices . To relate to Definition 9 choose the permutation matrix to arrive at
This shows that , and thus the limiting system has a smaller multi-index than the systems of the sequence. In case has reduced rank equal to one, a further reduction of the system order to along the same lines is possible, again leading to a limiting system with smaller multi-index .
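The rank-drop mechanism can be illustrated with a two-dimensional toy state equation; this system is ours for illustration and is not the system of Example 4. For d ≠ 0 the first state component integrates the second and is therefore I(2); at d = 0 the chain breaks and only I(1) behavior remains.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 20000
eps = rng.standard_normal((T, 2))

def first_state(d):
    """Simulate x_{t+1} = [[1, d], [0, 1]] x_t + eps_t; return the first
    state component."""
    x = np.zeros(2)
    path = np.empty(T)
    for t in range(T):
        x = np.array([[1.0, d], [0.0, 1.0]]) @ x + eps[t]
        path[t] = x[0]
    return path

# For d = 1 the first component is I(2): its first difference is the second
# component (a random walk) plus noise. For d = 0 it is a plain random walk.
i2, i1 = first_state(1.0), first_state(0.0)
print(np.std(np.diff(i2)), np.std(np.diff(i1)))  # the first is much larger
```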
The discussion shows that the closure of is related to lower order systems in the sense of Definition 9. The precise statement is given in Theorem 3 after a discussion of the closure of the stable subsystems.
4.1.2. The Closure of
Consider a convergent sequence of systems in and denote the limiting system by . Clearly, holds true for the limit of the sequence with for all j. Therefore, two cases have to be discussed for the limit:
- If , the potentially non-minimal limiting system corresponds to a minimal state space realization with Kronecker indices smaller or equal to (cf. Hannan and Deistler 1988, Theorem 2.5.3).
- If , the limiting matrix is similar to a block matrix , where all eigenvalues of have unit modulus and .
The first case is well understood, compare Hannan and Deistler (1988, chp. 2), since the limit in this case corresponds to a stable transfer function. In the second case the limiting system can be separated into two subsystems and , according to the block diagonal structure of . The state space unit root structure of the limiting system depends on the multiplicities of the eigenvalues of the matrix and is greater (in the sense of Definition 9) than the empty state space unit root structure. At the same time the Kronecker indices of the subsystem are smaller than , compare again Hannan and Deistler (1988, chp. 2). Since the Kronecker indices impose restrictions on some entries of the matrices and thus also on , the block and consequently also the limiting state space unit root structure might be subject to further restrictions.
4.1.3. The Conformable Index Set and the Closure of
The previous subsection shows that the closure of contains not only systems corresponding to transfer functions with multi-index smaller or equal to , but also systems that are related in a different way, formalized below.
Definition 10 (Conformable index set).
Given a multi-index , the set of conformable multi-indices contains all multi-indices , where:
- The pair with corresponding matrix in canonical form extends with corresponding matrix in canonical form, i.e., there exists a permutation matrix S such that
- .
- .
Please note that the definition implies . The importance of the set is clarified in the following theorem:
Theorem 3.
Transfer functions corresponding to state space realizations with multi-index are contained in the set . The set is contained in the union of all sets for with conformable to Γ, i.e.,
Theorem 3 provides a characterization of the transfer functions corresponding to systems in the closure of . The conformable set plays a key role here, since it characterizes the set of all minimal systems that can be obtained as limits of convergent sequences from within the set . Conformable indices extend the matrix corresponding to the unit root structure by the block .
The second inclusion in Theorem 3 is potentially strict, depending on the Kronecker indices in . Equality holds, e.g., in the following case:
Corollary 3.
For every multi-index Γ with the set of conformable indices consists only of Γ, which implies .
4.2. The Closure of
It remains to investigate the closure of in M. Hannan and Deistler (1988, Theorem 2.6.5 (ii) and Remark 3, p. 73) show that for any order n, there exist Kronecker indices corresponding to the generic neighborhood for transfer functions of order n such that
where . Here denotes the set of all transfer functions of order n with state space realizations satisfying . Every transfer function in can be approximated by a sequence of transfer functions in .
It can be easily seen that a generic neighborhood also exists for systems with state space unit root structure and without stable subsystem: Set the structure indices p to have a minimal number of elements restricted in p.u.t. sub-blocks of , i.e., for any block , or in case of a real unit root, set the corresponding structure indices to . Any p.u.t. matrix can be approximated by a matrix in this generic neighborhood with some positive entries restricted by the p.u.t. structure tending to zero. Combining these results with Theorem 3 implies the existence of a generic neighborhood for the canonical form considered in this paper:
Theorem 4.
Let be the set of all transfer functions with state space unit root structure . For every and , there exists a multi-index such that
Moreover, it holds that for every and satisfying .
Theorem 4 is the basis for choosing a generic multi-index for maximizing the pseudo likelihood function. For every and there exists a generic piece that—in its closure—contains all transfer functions of order and state space unit root structure : The set of transfer functions corresponding to the multi-index with the largest possible structure indices p in the sense of Definition 9 (iii) and generic Kronecker indices for the stable subsystem. Choosing these sets and their corresponding parameter spaces as model sets is, therefore, the most convenient choice for numerical maximization, if only and are known.
If, e.g., only an upper bound for the system order n is known and the goal is only to obtain consistent estimators, using is a feasible choice, since all transfer functions in the closure of the set can be approximated arbitrarily well, regardless of their potential state space unit root structure , . For testing hypotheses, however, it is important to understand the topological relations between sets corresponding to different multi-indices . In the following we focus on the multi-indices for arbitrary and .
The closure of contains also transfer functions that have a different state space unit root structure than . Considering convergent sequences of state space realizations of transfer functions in , the state space unit root structure of may differ in three ways:
- For sequences in canonical form rows of can tend to zero, which reduces the state space unit root structure as discussed in Section 4.1.1.
- Stable eigenvalues of may converge to the unit circle, thereby extending the unit root structure.
- Off-diagonal entries of the sub-block of may converge to zero in the sub-block of the limit in canonical form, resulting in a different attainable state space unit root structure. Here for all are regular matrices transforming to canonical form and transforms accordingly.
The first change of described above results in a transfer function with smaller state space unit root structure according to Definition 9 (ii). The implications of the other two cases are summarized in the following definition:
Definition 11 (Attainable unit root structures).
For given and the set of attainable unit root structures contains all pairs , where with corresponding matrix in canonical form extends with corresponding matrix in canonical form, i.e., there exists a permutation matrix S such that
where can be obtained by replacing off-diagonal entries in by zeros and where with the dimension of .
Remark 17.
It is a direct consequence of the definition of that implies .
Theorem 5.
- (i)
- is -open in (see Definition 8 for a definition of ).
- (ii)
- For every generic multi-index corresponding to and it holds that
Theorem 5 has important consequences for statistical analysis, e.g., PML estimation, since—as stated several times already—maximizing the pseudo likelihood function over effectively amounts to calculating the supremum over the larger set . Depending on the choice of the following asymptotic behavior may occur:
- If is chosen correctly and the estimator of the transfer function is consistent, openness of in its closure implies that the probability of the estimator being an interior point of tends to one asymptotically. Since the mapping attaching the parameters to the transfer function is continuous on an open and dense set, consistency in terms of transfer functions implies generic consistency of the parameter estimators.
- If the multi-index is incorrectly chosen to equal , estimator consistency is still possible if the true multi-index , as in this case . This is not too surprising: it parallels the well-known result in the simpler VAR framework that consistency of OLS can be established when the true autoregressive order is smaller than the order chosen for estimation. Thus, analogous to the lag order in the VAR case, a necessary condition for consistency is to choose the system order larger than or equal to the true system order.
Finally, note that Theorem 5 also implies the following result relevant for the determination of the unit root structure, further discussed in Section 5.1.1 and Section 5.2.1:
Corollary 4.
For every pair it holds that
5. Testing Commonly Used Hypotheses in the MFI(1) and I(2) Cases
This section discusses a large number of hypotheses, i.e., restrictions, on cointegrating spaces, adjustment coefficients and deterministic components that are often tested in the empirical literature. As in the VECM framework, discussed for the I(2) case in Section 2, testing hypotheses on the cointegrating spaces or adjustment coefficients may necessitate different reparameterizations.
5.1. The Case
By far the two most widely used cases of MFI(1) processes are processes and seasonally (co)integrated processes for quarterly data with state space unit root structure . In general, assuming for notational simplicity and , it holds that for and we have
The above equation provides an additive decomposition of into stochastic trends and cycles, the deterministic and stationary components. The stochastic cycles at frequency are, of course, given by the combination of sine and cosine terms. For the MFI(1) case this can also be seen directly from considering the real valued canonical form discussed in Remark 4, with the matrices for , given by in this case.
The ranks of are equal to the integers in . The number of stochastic trends is equal to , the number of stochastic cycles at frequency is equal to for and equal to if , as discussed in Section 3.
Moreover, in the MFI(1) case, is linked to the complex cointegrating rank at frequency , defined in Johansen (1991) and Johansen and Schaumburg (1999) in the VECM case as the rank of the matrix . For VARMA processes with arbitrary integration orders the complex cointegrating rank at frequency is , where is the transfer function, with in the MFI(1) case. Thus, in the MFI(1) case, determination of the state space unit root structure corresponds to determination of the complex cointegrating ranks in the VECM case.
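Numerically, such rank conditions can be checked by evaluating the relevant polynomial matrix at the unit root and computing a numerical rank from singular values. The sketch below treats the VAR special case with the common convention a(z) = I - A_1 z - ... - A_p z^p; function names and the tolerance rule are our choices.

```python
import numpy as np

def ar_poly_at(z, A_list):
    """Evaluate a(z) = I - A_1 z - ... - A_p z^p at a complex point z."""
    s = A_list[0].shape[0]
    val = np.eye(s, dtype=complex)
    for j, Aj in enumerate(A_list, start=1):
        val = val - Aj * z**j
    return val

def numerical_rank(M, tol=1e-8):
    """Numerical rank via singular values."""
    svals = np.linalg.svd(M, compute_uv=False)
    return int((svals > tol * max(svals[0], 1.0)).sum())

# VAR(1) with eigenvalues 0 and 1: a unit root at z = 1 with cointegrating
# rank 1, matching the rank of a(1)
A1 = np.array([[0.5, 0.5], [0.5, 0.5]])
print(numerical_rank(ar_poly_at(1.0, [A1])))  # -> 1
```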
In the VECM setting, the matrix is usually factorized into , as presented for the I(1) case in Section 2. For the column space of gives the cointegrating space of the process at frequency . For the relation between the column space of and the space of CIVs and PCIVs at the corresponding frequency is more involved. The columns of are orthogonal to the columns of , the sub-block of from a state space realization in canonical form corresponding to the VAR process. Analogously, the column space of the matrix , containing the so-called adjustment coefficients, is orthogonal to the row space of the sub-block of .
Both integers and are related to the dimensions of the static and dynamic cointegrating spaces in the MFI(1) case: For , the cointegrating rank coincides with the dimension of the static cointegrating space at frequency . Furthermore, the dimension of the static cointegrating space at frequency is bounded from above by , since it is spanned by at most vectors orthogonal to the complex valued matrix . The dimension of the dynamic cointegrating space at is equal to . Identifying again with the vector , a basis of the dynamic cointegrating space at is then given by the column space of the product
with the columns of spanning the orthogonal complement of the column space of , i.e., is of full rank and . This holds true since both factors are of full rank and satisfies , which corresponds to the necessary condition given in Example 2 for the columns of to be PCIVs. The latter also implies for , highlighting again the additional structure of the cointegrating space emanating from the complex conjugate pairs of eigenvalues (and matrices) as discussed in Example 2.
Please note that the relations between and discussed above only hold in the MFI(1) and I(1) special cases. For higher orders of integration no such simple relations exist.
In the MFI(1) setting the deterministic component typically includes a constant, seasonal dummies and a linear trend. As discussed in Remark 6, a sufficiently rich set of deterministic components allows one to absorb non-zero initial values .
5.1.1. Testing Hypotheses on the State Space Unit Root Structure
Using the generic sets of transfer functions presented in Theorem 4, we can construct pseudo likelihood ratio tests for different hypotheses against chosen alternatives. Note, however, that by the results of Theorem 5 the null hypothesis includes all pairs as well as all pairs that are smaller than a pair .
As common in the VECM setting, first consider hypotheses at a single frequency . For an MFI(1) process, the hypothesis of a state space unit root structure equal to corresponds to the hypothesis of the (complex) cointegrating rank at frequency being equal to . Maximization of the pseudo likelihood function over the set —with a suitably chosen order n—leads to estimates that may be arbitrarily close to transfer functions with different state space unit root structures . These include with additional unit root frequencies , with the integers restricted only by the order n. Therefore, focusing on a single frequency does not rule out a more complicated true state space unit root structure. Assume with for and else. Corollary 4 shows that
since, e.g., .
Analogously to the procedure of testing for the complex cointegrating rank in the VECM setting, these inclusions can be employed to test for : Start with the hypothesis of against the alternative of and decrease the assumed consecutively until the test does not reject the null hypothesis.
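The resulting sequential procedure can be summarized schematically; `test_stat` and `crit_val` are placeholders for a pseudo likelihood ratio statistic and its critical value, which are not supplied in this paper.

```python
def determine_rank(test_stat, crit_val, c_max):
    """Schematic top-down sequence: test successively smaller hypothesized
    values until the null is not rejected; c_max is the largest value
    compatible with the chosen order n."""
    for c in range(c_max, -1, -1):
        if test_stat(c) <= crit_val(c):
            return c          # first non-rejection determines the estimate
    return 0
```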
Furthermore, one can formulate hypotheses on jointly at different frequencies . Again, there exist inclusions based on the definition of the set of attainable state space unit root structures and Corollary 4, which can be used to consecutively test hypotheses on .
5.1.2. Testing Hypotheses on CIVs and PCIVs
Johansen (1995) considers in the case three types of hypotheses on the cointegrating space spanned by the columns of , each motivated by examples from economic research. The different cases correspond to different types of restrictions implied by economic theory.
- (i)
- : The cointegrating space is known to be a subspace of the column space of H (which is of full column rank).
- (ii)
- : Some cointegrating relations are known.
- (iii)
- , for such that . Cointegrating relations are known to be in the column spaces of matrices (which are of full column rank).
As discussed in Example 1, cointegration at occurs if and only if a vector satisfies . In other words, the column space of is the orthocomplement of the cointegrating space spanned by the columns of and hypotheses on restrict entries of .
The first type of hypothesis, , implies that the column space of is equal to the orthocomplement of the column space of . Assume w.l.o.g. , and , such that the columns of form an orthonormal basis for the orthocomplement of the cointegrating space. Consider now the mapping:
where and as in Lemma 1. From this one can derive a parameterization of the set of matrices corresponding to , analogously to Lemma 1. The difference in the number of free parameters under the null hypothesis and under the alternative equals the difference between the numbers of free parameters in and , implying a reduction in the number of free parameters of under the null hypothesis. This necessarily coincides with the number of degrees of freedom of the corresponding test statistic in the VECM setting (cf. Johansen 1995, Theorem 7.2).
The second type of hypothesis, , is also straightforwardly parameterized: In this case a subspace of the cointegrating space is known and given by the column space of . Assume w.l.o.g. . The orthocomplement of is given by the set of matrices satisfying the restriction , i.e., the set defined in (13). The parameterization of this set has already been discussed. The reduction of the number of free parameters under the null hypothesis is which again coincides with the number of degrees of freedom of the corresponding test statistic in the VECM setting (cf. Johansen 1995, Theorem 7.3).
Finally, the third type of hypothesis, , is the most difficult to parameterize in our setting. As an illustrative example consider the case and . W.l.o.g. choose such that its columns span the -dimensional intersection of the column spaces of and and choose such that the columns of and span the column space of . Define , with . Let w.l.o.g. and define , for and . A parameterization of satisfying the restrictions under the null hypothesis can be derived from the following mapping:
where as in Lemma 1 and is a product of Givens rotations corresponding to the entries in the blocks highlighted by bold font. The three matrices are defined as follows:
Consequently, a parameterization of the orthocomplement of the cointegrating space is based on the mapping:
where as above and as in Lemma 1. Please note that for all , and it holds that . The number of parameters restricted under is equal to , and thus, through and , depends on the dimension of the intersection of the column spaces of and . The reduction of the number of free parameters matches the degrees of freedom of the test statistics in Johansen (1995, Theorem 7.5), if is identified, which is the case if and .
Using the mapping as a basis for a parameterization allows to introduce another type of hypotheses of the form:
- (iv)
- , for such that . The orthocomplement of the cointegrating space is contained in the column spaces of the (full rank) matrices .
This type of hypothesis allows one, e.g., to test for the presence of cross-unit cointegrating relations (cf. Wagner and Hlouskova 2009, Definition 1) in multi-country data sets.
Hypotheses on the cointegrating space at frequency can be treated analogously to hypotheses on the cointegrating space at frequency .
Testing hypotheses on cointegrating spaces at frequencies has to be discussed in more detail, as one also has to consider the space spanned by PCIVs, compare Example 2. There are linearly independent PCIVs of the form . Every PCIV corresponds to a vector orthogonal to and consequently hypotheses on the space spanned by PCIVs can be transformed to hypotheses on the complex column space of .
Consider, e.g., an extension of the first type of hypothesis of the form
with , , which implies that the column space of is equal to the orthocomplement of the column space of . This general hypothesis encompasses, e.g., the hypothesis , with , by setting , and . The extension is tailored to include the pairwise structure of PCIVs and to simplify transformation into hypotheses on the complex matrix used in the parameterization. The parameterization of the set of matrices corresponding to is derived from a mapping of the form given in (15), with and replaced by and as in Lemma 2.
Similarly, the three other types of hypotheses on the cointegrating spaces considered above can be extended to hypotheses on the space of PCIVs in the MFI(1) case. They translate into hypotheses on complex valued matrices orthogonal to . To parameterize the set of matrices restricted according to these null hypotheses, Lemma 2 is used. Thus, the restrictions implied by the extensions of all four types of hypotheses to hypotheses on the dynamic cointegrating spaces at frequencies for MFI(1) processes can be implemented using Givens rotations.
A different case of interest is the hypothesis of at least m linearly independent CIVs , with , i.e., an m-dimensional static cointegrating space at frequency , which we discuss as another illustrative example of the procedure in the case of cointegration at complex unit roots.
For the dynamic cointegrating space, this hypothesis implies the existence of linearly independent PCIVs of the form and , . In light of the discussion above the necessary condition for these two polynomials to be PCIVs is equivalent to , for . This restriction is similar to discussed above, except for the fact that the cointegrating vectors are not fully specified. This hypothesis is equivalent to the existence of an m-dimensional real kernel of . A suitable parameterization is derived from the following mapping
where and as in Lemma 2. The difference in the number of free parameters without restrictions and with restrictions is equal to .
The hypotheses can also be tested jointly for the cointegrating spaces of several unit roots.
5.1.3. Testing Hypotheses on the Adjustment Coefficients
As in the case of hypotheses on the cointegrating spaces , hypotheses on the adjustment coefficients are typically formulated as hypotheses on the column spaces of . We only focus on hypotheses on the real valued corresponding to frequency zero. Analogous hypotheses may be considered for at frequencies , using the same ideas.
The first type of hypothesis on is of the form and therefore, can be rewritten as . W.l.o.g. let and . We deal with this type of hypothesis as with in the previous section by simply reversing the roles of and . We, therefore, consider the set of feasible matrices as a subset in and use the mapping to derive a parameterization, while is restricted to be a p.u.t. matrix and the set of feasible matrices is parameterized accordingly.
As a second type of hypothesis Juselius (2006, sct. 11.9, p. 200) discusses , , linked to the absence of permanent effects of shocks on any of the variables of the system. Assume w.l.o.g. . Using the parameterization of defined in (13) for the set of feasible matrices and the parameterization of the set of p.u.t. matrices for the set of feasible matrices , implements this restriction.
The restrictions on reduce the number of free parameters by and the restrictions implied by lead to a reduction by free parameters, compared to the unrestricted case, which matches in both cases the number of degrees of freedom of the corresponding test statistic in the VECM framework.
5.1.4. Restrictions on the Deterministic Components
Including an unrestricted constant in the VECM equation leads to a linear trend in the solution process , for . If one restricts the constant to in a general VECM equation as given in (4), with of rank r, the constant does not cumulate to a linear trend in the solution process, while a constant non-zero mean is still present in the cointegrating relations, i.e., in the process . Analogously an unrestricted linear trend in the VECM equation leads to a quadratic trend of the form in the solution process, which is excluded by the restriction .
In the VECM framework, compare Johansen (1995, sct. 5.7, p. 81), five restrictions related to the coefficients corresponding to the constant and the linear trend are commonly considered:
with and and the following consequences for the solution processes: Under the solution process contains a quadratic trend in the direction of the common trends, i.e., in , and a linear trend in the direction of the cointegrating relations, i.e., in . Under the quadratic trend is not present. features a linear trend only in the directions of the common trends, a constant only in these directions. Under the constant is also present in the directions of the cointegrating relations.
In the state space framework the deterministic components can be added in the output equation , compare (9). Consequently, the above considered hypotheses can be imposed by formulating linear restrictions on . These can be directly parameterized by including the following deterministic components in the five considered cases:
where and . The component captures the influence of the initial value in the output equation.
In the VECM framework for the seasonal MFI(1) case, with of rank for , the deterministic component usually includes restricted seasonal dummies of the form , to avoid summation in the directions of the stochastic trends. The state space framework makes it straightforward to include seasonal dummies in the output equation in the form , . Again, it is of interest whether these components are unrestricted or whether they take the form , , which similarly allows these components to be reinterpreted as the influence of the initial values on the output.
Please note that is equivalently given by using real coefficients and the desired restrictions can be implemented accordingly.
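For concreteness, a regressor matrix stacking the deterministic components discussed above (constant, linear trend, seasonal dummies) might be built as follows; the column ordering and the simple 0/1 dummy coding are our choices, not the paper's.

```python
import numpy as np

def deterministic_terms(T, S=4, const=True, trend=False):
    """Constant, linear trend and S-1 seasonal dummy columns for t = 1..T."""
    t = np.arange(1, T + 1)
    cols = []
    if const:
        cols.append(np.ones(T))
    if trend:
        cols.append(t.astype(float))
    for k in range(1, S):
        cols.append((t % S == k).astype(float))  # quarterly dummies for S = 4
    return np.column_stack(cols)

D = deterministic_terms(8, S=4, const=True, trend=True)
print(D.shape)   # (8, 5): constant, trend, three seasonal dummies
```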
5.2. The Case
The state space unit root structure of I(2) processes is of the form , where the integer equals the dimension of , and equals the dimension of . Recall that the solution for and of the system in canonical form in this setting is given by
For VAR processes integrated of order two the integers and of the corresponding state space unit root structure are linked to the ranks of the matrices (denoted as ) and (denoted as ) in the VECM setting, as discussed in Section 2. It holds that and . The relation of the state space unit root structure to the cointegration indices was also discussed in Section 3.
Again, both the integers and and the ranks r and m, and consequently also the indices and , are closely related to the dimensions of the spaces spanned by CIVs and PCIVs. In the case the static cointegrating space of order is the orthocomplement of the column space of and thus of dimension . The dimension of the space spanned by CIVs of order is equal to , where denotes the rank of , since this space is the orthocomplement of the column space of . The space spanned by the PCIVs of order is of dimension smaller or equal to , due to the orthogonality constraint on given in Example 3.
Consider the matrices , and as defined in Section 2. From a state space realization in canonical form corresponding to a VAR process it immediately follows that the columns of span the same space as the columns of the sub-block . The same relation holds true for and the sub-block . With respect to polynomial cointegration, Bauer and Wagner (2012) show that the rank of determines the number of minimum degree polynomial cointegrating relations, as discussed in Example 3. If , then there exists no vector , such that is integrated and cointegrated with . In this case is a stationary process.
The deterministic components included in the I(2) setting are typically a constant and a linear trend. As in the MFI(1) case, identifiability problems occur if we consider a non-zero initial state : The solution to the state space equations for and is given by:
Hence, if , the output equation contains the terms and . Again, this implies non-identifiability, which is resolved by assuming , compare Remark 6.
5.2.1. Testing Hypotheses on the State Space Unit Root Structure
To simplify notation we use
with . Here for denotes the closure of the set of transfer functions of order n that possess a state space unit root structure of either or in case of , while denotes the closure of the set of all stable transfer functions of order n.
Considering the relations between the different sets of transfer functions given in Corollary 4 shows that the following relations hold (assuming ; the columns are arranged to include transfer functions with the same dimension of ):
Please note that corresponds to in Johansen (1995). Therefore, the relationships between the subsets match the ones in Johansen (1995, Table 9.1) and the ones found by Jensen (2013). The latter type of inclusion appears, for instance, for , containing transfer functions corresponding to processes, which is a subset of the set of transfer functions corresponding to processes.
The same remarks as in the MFI(1) case also apply in the I(2) case: When testing for , all attainable state space unit root structures have to be included in the null hypothesis.
5.2.2. Testing Hypotheses on CIVs and PCIVs
Johansen (2006) discusses several types of hypotheses on the cointegrating spaces of different orders. These deal with properties of , joint properties of or the occurrence of non-trivial polynomial cointegrating relations. Boswijk and Paruolo (2017), moreover, discuss testing hypotheses on the loading matrices of common trends (corresponding in our setting to testing hypotheses on ).
We commence with hypotheses of the form and just as in the MFI(1) case at unit root one, since hypotheses on correspond to hypotheses on its orthocomplement spanned by in the VARMA framework:
Hypotheses of the form imply . W.l.o.g. let and . As in the parameterization under in the MFI(1) case at unit root one, compare (15), use the mapping
to derive a parameterization of the set of feasible matrices , i.e., a joint parameterization of both sets of matrices and , where .
Hypotheses of the form are equivalent to . Assume w.l.o.g. and parameterize the set of feasible matrices using as defined in (13) and the set of feasible matrices using . Alternatively, parameterize the set of feasible matrices jointly as elements .
Applications using the VECM framework allow for testing hypotheses on . In the VARMA framework, these correspond to hypotheses on the orthogonal complement of , i.e., . Implementation of different types of hypotheses on proceeds as for similar hypotheses on in the MFI(1) case at unit root one, replacing by .
The hypothesis of no minimum degree polynomial cointegrating relations implies the restriction , compare Example 3. Therefore, we can test all hypotheses considered in Johansen (2006) also in our more general setting.
5.2.3. Testing Hypotheses on the Adjustment Coefficients
Hypotheses on and as defined in (6) and (7) correspond to hypotheses on the spaces spanned by the rows of and . For VAR processes integrated of order two, the row space of is equal to the orthogonal complement of the column space of , while the row space of is equal to the orthogonal complement of the column space of . The restrictions corresponding to hypotheses on and can be implemented analogously to the restrictions corresponding to hypotheses on in Section 5.1.3, reversing the roles of the relevant sub-blocks in and accordingly.
5.2.4. Restrictions on the Deterministic Components
The I(2) case is, with respect to the modeling of deterministic components, less well studied than the MFI(1) case. In most theory papers they are simply left out, with the notable exception of Rahbek et al. (1999), which deals with the inclusion of a constant term in the I(2)-VECM representation. The main reason for this appears to be the way deterministic components in the defining vector error correction representation translate into deterministic components in the corresponding solution process. An unrestricted constant in the VECM for I(2) processes leads to a linear trend in and a quadratic trend in , while an unrestricted linear trend results in quadratic and cubic trends in the respective directions. Already in the I(1) case discussed above, five different cases—with respect to integration and asymptotic behavior of estimators and tests—need to be considered separately. An all-encompassing discussion of the restrictions on the coefficients of a constant and a linear trend in the I(2) case requires the specification of even more cases. As an alternative approach in the VECM framework, deterministic components could be dealt with by replacing with in the VECM equation. This has recently been considered in Johansen and Nielsen (2018) and is analogous to our approach in the state space framework.
As in the MFI(1) and I(1) cases before, the analysis of (the impact of) deterministic components is straightforward in the state space framework, which effectively stems from their additive inclusion in the Granger-type representation, compare (9). Choose, e.g., , as in the I(1) case. In analogy to Section 5.1.4, linear restrictions of deterministic components in relation to the static and polynomial cointegrating spaces can be embedded in a parameterization. Focusing on , e.g., this is achieved by
where the columns of are a basis for the column space of , which does not necessarily have full column rank, and the columns of span the orthocomplement of the column space of . The matrix can be decomposed analogously. The corresponding parameterization then allows one to consider different restricted versions of deterministic components and to study the asymptotic behavior of estimators and tests for these cases.
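A sketch of the directional decomposition used here, assuming for simplicity that the spanning matrix has full column rank (the text explicitly allows rank deficiency, in which case a pseudoinverse-based projector would be used instead); names are ours:

```python
import numpy as np

def split_directions(mu, beta):
    """Split a coefficient vector mu into its component in col(beta) and the
    component in the orthocomplement of col(beta)."""
    P = beta @ np.linalg.solve(beta.T @ beta, beta.T)  # projector onto col(beta)
    return P @ mu, mu - P @ mu

beta = np.array([[1.0], [-1.0]])
inside, outside = split_directions(np.array([2.0, 0.0]), beta)
# inside lies in span{(1, -1)'}, outside in span{(1, 1)'}
```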
6. Summary and Conclusions
Vector autoregressive moving average (VARMA) processes, which can be cast equivalently in the state space framework, offer potential advantages for empirical analysis over the more restrictive class of vector autoregressive (VAR) processes for a variety of reasons. These include invariance with respect to marginalization and aggregation, parsimony, as well as the fact that the log-linearized solutions to DSGE models are typically VARMA processes rather than VAR processes. Realizing these potential advantages necessitates, in our view, developing cointegration analysis for VARMA processes to a similar extent as it is developed for VAR processes. The necessary first step of this research agenda is to develop a set of structure theoretic results that subsequently allow the development of statistical inference procedures. Bauer and Wagner (2012) takes the very first step of this agenda by providing a canonical form for unit root processes in the state space framework, which is shown in that paper to be very convenient for cointegration analysis.
Based on that canonical form, this paper derives a parameterization of state space models for VARMA processes with unit roots. The canonical form, and a fortiori the parameterization based on it, are constructed to facilitate the investigation of the unit root and (static and polynomial) cointegration properties of the considered process. Furthermore, the paper shows that the framework allows testing a large variety of hypotheses on cointegrating ranks and spaces, clearly a key aspect for the usefulness of any method to analyze cointegration. In addition to providing general results, throughout the paper all results are discussed in detail for the multiple frequency I(1) and I(2) cases, which cover the vast majority of applications.
Given the fact that (as shown in Hazewinkel and Kalman 1976) VARMA unit root processes cannot be continuously parameterized, the set of all unit root processes (as defined in this paper) is partitioned according to a multi-index that includes the state space unit root structure. The parameterization is shown to be a diffeomorphism on the interior of the considered sets. The topological relationships between the sets forming the partitioning of all transfer functions considered are studied in great detail for three reasons: First, pseudo maximum likelihood estimation effectively amounts to maximizing the pseudo likelihood function over the closures of sets of transfer functions, in our notation. Second, related to the first item, the relations between subsets of have to be understood in detail as knowledge concerning these relations is required for developing (sequential) pseudo likelihood-ratio tests for the numbers of stochastic trends or cycles. Third, of particular importance for the implementation of, e.g., pseudo maximum likelihood estimators, we discuss the existence of generic pieces.
In this respect we derive two results: First, for correctly specified state space unit root structure and system order of the stable subsystem —and thus correctly specified system order—we explicitly describe generic indices such that is open and dense in the set of all transfer functions with state space unit root structure and system order of the stable subsystem . This result forms the basis for establishing consistent estimators of the transfer functions—and via continuity of the parameterization—of the parameter estimators when the state space unit root structure and system order are known. Second, in case only an upper bound on the system order is known (or specified), we show the existence of a generic multi-index for which the set of corresponding transfer functions is open and dense in the set of all non-explosive transfer functions whose order (or McMillan degree) is bounded by n. This result is the basis for consistent estimation (on an open and dense subset) when only an upper bound of the system order is known. In turn this estimator is the starting point for determining , using the subset relationships alluded to above in the second point. For the MFI(1) and I(2) cases we show in detail that similar subset relations (concerning cointegrating ranks) as in the cointegrated VAR MFI(1) and I(2) cases hold, which suggests constructing similar sequential test procedures for determining the cointegrating ranks as in the VAR cointegration literature.
Section 5 is devoted to a detailed discussion of testing hypotheses on the cointegrating spaces, again for both the MFI(1) and the I(2) case. In this section, particular emphasis is put on modeling deterministic components. The discussion details how all usually formulated and tested hypotheses concerning (static and polynomial) cointegrating vectors, potentially in combination with (un-)restricted deterministic components, in the VAR framework can also be investigated in the state space framework.
Altogether, the paper sets the stage to develop pseudo maximum likelihood estimators, investigate their asymptotic properties (consistency and limiting distributions) and tests based on them for determining cointegrating ranks that allow performing cointegration analysis for cointegrated VARMA processes. The detailed discussion of the MFI(1) and I(2) cases benefits the development of statistical theory dealing with these cases undertaken in a series of companion papers.
Author Contributions
The authors of the paper have contributed equally, via joint efforts, regarding ideas, research, and writing. Conceptualization, all authors; methodology, all authors; formal analysis, P.d.M.R. and L.M.; investigation, all authors; writing—original draft preparation, P.d.M.R. and L.M.; writing—review and editing, all authors; project administration, D.B. and M.W.; funding acquisition, D.B. and M.W. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation; Projektnummer 276051388), which is gratefully acknowledged. We acknowledge support for the publication costs by the Deutsche Forschungsgemeinschaft and the Open Access Publication Fund of Bielefeld University.
Acknowledgments
We thank the editors, Rocco Mosconi and Paolo Paruolo, as well as anonymous referees for helpful suggestions. The views expressed in this paper are solely those of the authors and not necessarily those of the Bank of Slovenia or the European System of Central Banks. On top of this the usual disclaimer applies.
Conflicts of Interest
The authors declare no conflict of interest.
Appendix A. Proofs of the Results of Section 3
Appendix A.1. Proof of Lemma 1
- (i) Let be a sequence in converging to for . By continuity of matrix multiplication it follows that , which shows that is closed. By construction . Since for all and , the entries of C are bounded.
- (ii) By definition is a product of matrices whose elements are either constant or infinitely often differentiable functions of the elements of .
- (iii) The algorithm discussed above Lemma 1 maps every to . Since for all and , C can be obtained by multiplying with the transposed Givens rotations; see the illustration following this proof.
- (iv) As discussed, is obtained from a repeated application of the algorithm described in Remark 10. In each step two entries are transformed into polar coordinates. According to Amann and Escher (2008, chp. 8, p. 204), the transformation to polar coordinates is infinitely often differentiable with infinitely often differentiable inverse for (and hence ), i.e., on the interior of the interval . Thus, is a concatenation of functions which are infinitely often differentiable on the interior of and is thus infinitely often differentiable, if for all components of . Clearly, the interior of is open and dense in . By the definition of continuity the pre-image of the interior of is open in . By (iii) there exists a for arbitrary such that . Since the interior of is dense in , there exists a sequence in the interior of such that . Then because of the continuity of . Since is a sequence in the pre-image of the interior of , it follows that the pre-image of the interior of is dense in .
- (v) For any it holds that and , which implies . Since the determinant is a continuous function on square matrices, the sets and are disjoint and closed.
- (vi) The proof proceeds analogously to the proof of (iii).
- (vii) A function defined on two disjoint subsets is infinitely often differentiable if and only if its restrictions to the two subsets are infinitely often differentiable. The same arguments as used in (iv), together with the results in (ii), imply that and are infinitely often differentiable with infinitely often differentiable inverses on an open subset of . Clearly, multiplication with is infinitely often differentiable with infinitely often differentiable inverse, which implies that is infinitely often differentiable with infinitely often differentiable inverse on an open subset of , from which the result follows.
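For concreteness, the two devices used in (iii) and (iv) can be written out explicitly (a sketch in our own notation; both facts are standard, compare Golub and van Loan (1996) for Givens rotations and Amann and Escher (2008, chp. 8) for polar coordinates). A Givens rotation acting on a pair of coordinates and the planar polar coordinate transformation take the form

$$ G(\phi) = \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix}, \qquad (r,\phi) \mapsto (r\cos\phi,\, r\sin\phi). $$

Since $G(\phi)'G(\phi) = I_2$, a normalization achieved by a product of Givens rotations is undone by multiplying with the transposed rotations, as used in (iii); and the polar coordinate map is infinitely often differentiable with infinitely often differentiable inverse for $r > 0$ and $\phi$ in the interior of the angular interval, as used in (iv).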
Appendix A.2. Proof of Lemma 2
- (i) Let be a sequence in converging to for . By continuity of matrix multiplication it follows that , which shows that is closed. By construction . Since for all and , the entries of C are bounded.
- (ii) By definition is a product of matrices whose elements are either constant or infinitely often differentiable functions of the elements of .
- (iii) The algorithm discussed above Lemma 2 maps every to with . Since for all and , C can be obtained by multiplying with the transposed Givens rotations.
- (iv) The algorithms in Remark 12 and above Lemma 2 describe in detail. The determination of an element of or uses the transformation of two complex numbers into polar coordinates in step 2 of Remark 12, which according to Amann and Escher (2008, chp. 8, p. 204) is infinitely often differentiable with infinitely often differentiable inverse except on the non-negative reals, which are the complement of an open and dense subset of the complex plane. Step 3 of Remark 12 uses the formulas , which is infinitely often differentiable for , and , which is infinitely often differentiable for , which occurs on an open and dense subset of . For the determination of an element of , a complex number of modulus one is transformed into polar coordinates, which is infinitely often differentiable on an open and dense subset of the complex numbers of modulus one; compare again Amann and Escher (2008, chp. 8, p. 204) and the illustration following this proof. Thus, is a concatenation of functions which are infinitely often differentiable on open and dense subsets of their domains of definition and is thus infinitely often differentiable on an open and dense subset of .
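For the complex case (again a sketch in our own notation), the transformation referred to in the proof is the complex polar coordinate map

$$ z = r e^{i\phi}, \qquad r > 0, \quad \phi \in (0, 2\pi), $$

which is infinitely often differentiable with infinitely often differentiable inverse on $\mathbb{C}\setminus[0,\infty)$, i.e., everywhere except on the non-negative reals; the exceptional set is exactly the complement of the open and dense subset of the complex plane mentioned above.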
Appendix A.3. Proof of Theorem 2
- (i) The multi-index is unique for a transfer function , since it only contains information encoded in the canonical form. Therefore, is well defined. Since conversely for every transfer function a multi-index can be found, constitutes a partitioning of . Furthermore, using the canonical form, it is straightforward to see that the mapping attaching the triple in canonical form to a transfer function is homeomorphic (bijective, continuous, with continuous inverse): Bijectivity is a consequence of the definition of the canonical form. Continuity of the transfer function as a function of the matrix triple is obvious from the definition of ; see also the sketch following this proof. Continuity of the inverse can be shown by constructing the canonical form starting with an overlapping echelon form (which is continuous according to Hannan and Deistler 1988, chp. 2) and subsequently transforming the state basis to reach the canonical form. This involves the calculation of a Jordan normal form with fixed structure, which is an analytic mapping (cf. Chatelin 1993, Theorem 4.4.3). Finally, the restrictions on C and B are imposed. For given multi-index these transformations are continuous (as discussed above, they involve QR decompositions to obtain unitary block columns for the blocks of C, rotations to p.u.t. form with fixed structure for the blocks of B, and transformations to echelon canonical form for the stable part).
- (ii) The construction of the triple for given and is straightforward: is uniquely determined by . Since contains the entries of restricted to be positive and contains the free parameters of , the mapping is continuous. The mapping is continuous (cf. Hannan and Deistler 1988, Theorem 2.5.3 (ii)). The mapping consists of iterated applications of and (compare Lemmas 1 and 2), which are differentiable and thus continuous, and of iterated applications of the extensions of the mappings and (compare Corollaries 1 and 2) to general unit root structures and to complex matrices. The proof that these functions are differentiable is analogous to the proofs of Lemmas 1 and 2.
- (iii) The definitions of and immediately imply that they depend continuously on . The parameter vector depends continuously on (cf. Hannan and Deistler 1988, Theorem 2.5.3 (ii)). The existence of an open and dense subset of matrices such that the mapping attaching parameters to the matrices is continuous follows from arguments contained in the proofs of Lemmas 1 and 2.
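To sketch the continuity assertion in (i), assume the innovation form convention for state space systems (the paper's precise notation is fixed in Section 3, so the following rendering is ours): a triple $(A, B, C)$ with state dimension $n$ and output dimension $s$ has transfer function

$$ k(z) = I_s + z\,C(I_n - zA)^{-1}B = I_s + \sum_{j \ge 1} C A^{j-1} B\, z^j. $$

Every power series coefficient $C A^{j-1} B$ is a polynomial in the entries of $(A, B, C)$, so convergence of the triples implies coefficient-wise and hence pointwise convergence of the transfer functions.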
Appendix B. Proofs of the Results of Section 4
Appendix B.1. Proof of Theorem 3
For the first inclusion the proof can be divided into two parts, discussing the stable and the unstable subsystem separately. The result with regard to the stable subsystem is due to Hannan and Deistler (1988, Theorem 2.5.3 (iv)). For the unstable subsystem implies the existence of a matrix S as described in Definition 9. Partition such that . Let be an arbitrary transfer function in with corresponding state space realization . Then, we find matrices and such that for the state space realization given by , and it holds that . Then, , where is the number of rows of for converges for to , which is observationally equivalent to . Consequently, .
To show the second inclusion, consider a sequence of systems converging to . We need to show , where is the multi-index corresponding to .
For the stable system we can separate the subsystem remaining stable in the limit and the part with eigenvalues of tending to the unit circle. As discussed in Section 4.1.2, converges to the stable subsystem whose Kronecker indices can only be smaller than or equal to (cf. Hannan and Deistler 1988, Theorem 2.5.3).
The remaining subsystem consists of the unstable subsystem of which converges to and the second part of the stable subsystem containing all stable eigenvalues of converging to the unit circle. The limiting combined subsystem is such that is block diagonal. If the limiting combined subsystem is minimal and has a structure corresponding to p, this shows that the pair extends in accordance with the definition of .
Since the limiting subsystem is not necessarily minimal and does not necessarily have a structure corresponding to p, eliminating coordinates of the state and adapting the corresponding structure indices p may result in a pair that is smaller than the pair corresponding to an element of .
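A minimal scalar sketch of the second mechanism (stable eigenvalues tending to the unit circle; the numbers are hypothetical and the notation ours): consider first-order systems with state transition coefficients $a_j = 1 - 1/j$ and fixed $b, c \neq 0$, so that

$$ k_j(z) = 1 + \frac{cb\,z}{1 - a_j z} \;\longrightarrow\; 1 + \frac{cb\,z}{1 - z} \qquad (j \to \infty), $$

with every power series coefficient $cb\,a_j^{m-1}$ converging to $cb$. A sequence of stable transfer functions thus converges pointwise to a transfer function with a pole at $z = 1$: in the limit a stable eigenvalue reaches the unit circle and the state space unit root structure changes, while the order of the stable subsystem drops.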
Appendix B.2. Proof of Theorem 4
The multi-index contains three components: . For given , the selection of the structure indices introducing the fewest restrictions, such that in its boundary all possible p.u.t. matrices occur, was discussed in Section 4.2. Choosing this maximal element then implies that all systems of given state space unit root structure correspond to a multi-index that is smaller than or equal to , where is a Kronecker index corresponding to state space dimension . For Kronecker indices of order it is known that there exists one index such that is open and dense in . The set is, therefore, contained in , which implies (14) with .
For the second claim choose an arbitrary state space realization in canonical form such that for arbitrary . Define the sequence by , , . Then holds for all j, which implies for every and every j. The continuity of implies .
Appendix B.3. Proof of Theorem 5
- (i) Assume that there exists a sequence converging to a transfer function . For such a sequence the sizes of the Jordan blocks for every unit root are identical from some onwards, since eigenvalues depend continuously on the matrices (cf. Chatelin 1993): Thus, the stable parts of the transfer functions must converge to the stable part of the transfer function , since the sum of the algebraic multiplicities of all eigenvalues inside the open unit disc cannot drop in the limit. Since (the set of all stable transfer functions with Kronecker index ) is open in according to Hannan and Deistler (1988, Theorem 2.5.3), this implies that the stable part of has Kronecker index from some onwards. For the unstable part of the transfer function note that in for every unit root the rank of is equal for every r. Thus, the maximum over cannot be larger due to lower semi-continuity of the rank. It follows that for the ranks of for all and for all are identical to the ranks corresponding to from some point onwards, showing that has the same state space unit root structure as from some onwards. Finally, the p.u.t. structure of sub-blocks of clearly defines an open set, being given by strict inequalities. This shows that from some onwards, implying that is open in .
- (ii) The first inclusion was shown in Theorem 3. Comparing Definitions 10 and 11 we see . By the definition of the partial ordering (compare Definition 9), holds. Together these two statements imply the second inclusion. The next relation is a consequence of the following two statements:
- (a) If , then .
- (b) If , then .
For (a) note that for an arbitrary transfer function with there is a multi-index such that . By the definition of the partial ordering (compare Definition 9) we find a multi-index such that . By Theorem 3 and the continuity of we have . Since by assumption, this finishes the proof of (a).

With respect to (b) note that by Definition 11, contains transfer functions with two types of state space unit root structures. First, corresponding to state space unit root may be of the form given in (A1). Second, corresponding to state space unit root may be of the form (A1) with the off-diagonal elements of replaced by zero. To prove (b) we need to show that in both cases the corresponding transfer function is contained in .

We start by showing that in the second case the transfer function is contained in , where is the state space unit root structure corresponding to in (A1). For this, consider the sequence . Clearly, every system corresponds to an process, while the limit for corresponds to an process. This shows that it is possible in the limit to trade one component for two components, leading to more transfer functions in the closure of than only the ones included in , where the off-diagonal entry in is restricted to equal one and hence the corresponding sequence of systems in the canonical form diverges to infinity. In a sense these systems correspond to “points at infinity”: For the example given above we obtain the canonical form . Thus, the corresponding parameter vector for the entries in converges to zero and the one corresponding to to infinity. Generalizing this argument shows that every transfer function corresponding to a pair in , where can be obtained by replacing off-diagonal entries of with zero, can be reached from within .

To prove containment in the first case, where the state space unit root structure is extended as visible in Equation (A1), consider the sequence corresponding to the following system in canonical form (except that the stable subsystem is not necessarily in echelon canonical form): . This sequence shows that there exists a sequence of transfer functions corresponding to processes with one common trend that converge to a transfer function corresponding to an system. Again, in the canonical form this cannot happen, as there the entry of would be restricted to be equal to zero. At the same time note that the dimension of the stable system is reduced due to one component of the state changing from the stable to the unit root part.

Now for a unit root structure such that , satisfying , the Jordan blocks corresponding to are sub-blocks of the ones corresponding to , potentially involving a reordering of coordinates using the permutation matrix S. Taking as the approximating sequence transfer functions of the same structure but with replaced by leads to processes with state space unit root structure .

For the stable part of we can separate the part containing poles tending to the unit circle (contained in ) and the remaining transfer function , which has Kronecker indices . However, the results of Hannan and Deistler (1988, Theorem 2.5.3) then imply that the limit remains in and hence allows for an approximating sequence in .

Both results combined cover the whole set of attainable state space unit root structures in Definition 11 and prove (b).

As follows from Corollary 4, . Thus, (b) implies and (a) adds the second union, showing the subset inclusion. It remains to show equality for the last set inclusion. Thus, we need to show that for , it holds that , where .
To this end note that the rank of a matrix is a lower semi-continuous function, i.e., the rank of the limit of a convergent sequence of matrices is bounded from above by the limit inferior of the ranks along the sequence. Then, consider a sequence . We can find a converging sequence of systems realizing . Therefore, choosing we obtain that , since implies that the number of generalized eigenvalues at the unit roots is governed by the entries of the state space unit root structure . This implies that for . Consequently, the limit has at least as many chains of generalized eigenvalues of each maximal length as dictated by the state space unit root structure for each unit root of the limiting system.

Rearranging the rows and columns of the Jordan normal form using a permutation matrix S, it is then obvious that either the limiting matrix has additional eigenvalues, in which case must hold, or upper diagonal entries in must be changed from ones to zeros in order to convert some of the chains to lower order. One example in this respect was given above: For the rank of is equal to 1 for and 0 for . For the limit we obtain , and hence the rank is zero for . The corresponding indices are for the approximating sequence and for the limit, respectively. Summing these indices starting from the last one, one obtains and . Hence the state space unit root structure corresponding to must be attainable according to Definition 11. The number of stable state components must decrease accordingly.

Finally, the limiting system is potentially not minimal. In this case the pair is reduced to a smaller one, concluding the proof.
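To make the rank example concrete (a reconstruction in our notation of the two-dimensional example the proof refers to): for

$$ J_\epsilon = \begin{pmatrix} 1 & \epsilon \\ 0 & 1 \end{pmatrix} \longrightarrow J_0 = I_2 \quad (\epsilon \to 0), \qquad \operatorname{rank}(J_\epsilon - I_2) = 1 \ \text{for } \epsilon \neq 0, \qquad \operatorname{rank}(J_0 - I_2) = 0, $$

one chain of generalized eigenvalues of length two (an I(2)-type structure) degenerates in the limit into two chains of length one (an I(1)-type structure). The ranks of the powers of the matrix minus the identity can only drop in the limit, exactly as lower semi-continuity of the rank dictates.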
References
- Amann, Herbert, and Joachim Escher. 2008. Analysis III. Basel: Birkhäuser.
- Aoki, Masanao. 1990. State Space Modeling of Time Series. New York: Springer.
- Bauer, Dietmar, and Martin Wagner. 2003. On Polynomial Cointegration in the State Space Framework. Mimeo.
- Bauer, Dietmar, and Martin Wagner. 2005. Autoregressive Approximations of Multiple Frequency I(1) Processes. IHS Economics Series No. 174. Vienna: Institut für Höhere Studien–Institute for Advanced Studies (IHS). Available online: http://hdl.handle.net/10419/72306 (accessed on 3 November 2020).
- Bauer, Dietmar, and Martin Wagner. 2012. A State Space Canonical Form for Unit Root Processes. Econometric Theory 28: 1313–49.
- Boswijk, H. Peter, and Paolo Paruolo. 2017. Likelihood Ratio Tests of Restrictions on Common Trends Loading Matrices in I(2) VAR Systems. Econometrics 5: 28.
- Campbell, John Y. 1994. Inspecting the Mechanism: An Analytical Approach to the Stochastic Growth Model. Journal of Monetary Economics 33: 463–506.
- Chatelin, Françoise. 1993. Eigenvalues of Matrices. New York: John Wiley & Sons.
- Engle, Robert F., and Clive W.J. Granger. 1987. Co-Integration and Error Correction: Representation, Estimation and Testing. Econometrica 55: 251–76.
- Golub, Gene H., and Charles F. Van Loan. 1996. Matrix Computations, 3rd ed. Baltimore: The Johns Hopkins University Press.
- Granger, Clive W.J. 1981. Some Properties of Time Series Data and Their Use in Econometric Model Specification. Journal of Econometrics 16: 121–30.
- Hannan, Edward J., and Manfred Deistler. 1988. The Statistical Theory of Linear Systems. New York: John Wiley & Sons.
- Hazewinkel, Michiel, and Rudolf E. Kalman. 1976. Invariants, Canonical Forms and Moduli for Linear, Constant, Finite Dimensional, Dynamical Systems. In Mathematical Systems Theory. Edited by Giovanni Marchesini and Sanjoy Kumar Mitter. Berlin: Springer, chp. 4, pp. 48–60.
- Jensen, Andreas N. 2013. The Nesting Structure of the Cointegrated Vector Autoregressive Models. Paper presented at the QED Conference 2013, Vienna, Austria, May 3–4.
- Johansen, Søren. 1991. Estimation and Hypothesis Testing of Cointegration Vectors in Gaussian Vector Autoregressive Models. Econometrica 59: 1551–80.
- Johansen, Søren. 1995. Likelihood-Based Inference in Cointegrated Vector Auto-Regressive Models. Oxford: Oxford University Press.
- Johansen, Søren. 1997. Likelihood Analysis of the I(2) Model. Scandinavian Journal of Statistics 24: 433–62.
- Johansen, Søren. 2006. Statistical Analysis of Hypotheses on the Cointegrating Relations in the I(2) Model. Journal of Econometrics 132: 81–115.
- Johansen, Søren, and Morten Ø. Nielsen. 2018. The Cointegrated Vector Autoregressive Model with General Deterministic Terms. Journal of Econometrics 202: 214–29.
- Johansen, Søren, and Ernst Schaumburg. 1999. Likelihood Analysis of Seasonal Cointegration. Journal of Econometrics 88: 301–39.
- Juselius, Katarina. 2006. The Cointegrated VAR Model: Methodology and Applications. Oxford: Oxford University Press.
- Lewis, Richard, and Gregory C. Reinsel. 1985. Prediction of Multivariate Time Series by Autoregressive Model Fitting. Journal of Multivariate Analysis 16: 393–411.
- Otto, Markus. 2011. Rechenmethoden für Studierende der Physik im ersten Jahr. Heidelberg: Spektrum Akademischer Verlag.
- Paruolo, Paolo. 1996. On the Determination of Integration Indices in I(2) Systems. Journal of Econometrics 72: 313–56.
- Paruolo, Paolo. 2000. Asymptotic Efficiency of the Two Stage Estimator in I(2) Systems. Econometric Theory 16: 524–50.
- Poskitt, Donald S. 2006. On the Identification and Estimation of Nonstationary and Cointegrated ARMAX Systems. Econometric Theory 22: 1138–75.
- Rahbek, Anders, Hans C. Kongsted, and Clara Jørgensen. 1999. Trend Stationarity in the I(2) Cointegration Model. Journal of Econometrics 90: 265–89.
- Saikkonen, Pentti. 1992. Estimation and Testing of Cointegrated Systems by an Autoregressive Approximation. Econometric Theory 8: 1–27.
- Saikkonen, Pentti, and Ritva Luukkonen. 1997. Testing Cointegration in Infinite Order Vector Autoregressive Processes. Journal of Econometrics 81: 93–126.
- Sims, Christopher A., James H. Stock, and Mark W. Watson. 1990. Inference in Linear Time Series Models with Some Unit Roots. Econometrica 58: 113–44.
- Wagner, Martin. 2018. Estimation and Inference for Cointegrating Regressions. In Oxford Research Encyclopedia of Economics and Finance. Oxford: Oxford University Press.
- Wagner, Martin, and Jaroslava Hlouskova. 2009. The Performance of Panel Cointegration Methods: Results from a Large Scale Simulation Study. Econometric Reviews 29: 182–223.
- Zellner, Arnold, and Franz C. Palm. 1974. Time Series Analysis and Simultaneous Equation Econometric Models. Journal of Econometrics 2: 17–54.
1. Please note that the original approach to the estimation of cointegrating relationships was least squares estimation in a non- or semi-parametric regression setting, see, e.g., Engle and Granger (1987). A recent survey of regression-based cointegration analysis is provided by Wagner (2018).
2. The complexity of these inter-relations is perhaps best illustrated by the fact that only Jensen (2013) notes that “even though the I(2) models are formulated as submodels of I(1) models, some I(1) models are in fact submodels of I(2) models”.
3. The literature often uses VAR models as approximations, based on the fact that VARMA processes can often be approximated by VAR models whose order tends to infinity with the sample size at certain rates. This line of work goes back to Lewis and Reinsel (1985) for stationary processes and was extended to (co)integrated processes by Saikkonen (1992), Saikkonen and Luukkonen (1997) and Bauer and Wagner (2005). In addition to the issue of the existence and properties of a sequence of VAR approximations, the question of whether a VAR approximation is parsimonious remains.
4. Below we often use the term “likelihood” as short form of “likelihood function”.
5. We are confident that this dual usage of notation does not lead to confusion.
6. Our definition of VAR processes differs to a certain extent from some widely used definitions in the literature. Given our focus on unit root and cointegration analysis we, unlike Hannan and Deistler (1988), allow for determinantal roots on the unit circle that, as is well known, lead to integrated processes. We also include deterministic components in our definition, i.e., we allow for a special case of exogenous variables, compare also Remark 2 below. There is, however, also a large part of the literature that refers to this setting simply as (cointegrated) vector autoregressive models, see, e.g., Johansen (1995) and Juselius (2006).
7. Of course, the statistical properties of the parameter estimators depend in many ways on the deterministic components.
8. The set is endowed with the pointwise topology, defined in Section 3. For now, in the context of VAR models, it suffices to know that convergence in the pointwise topology is equivalent to convergence of the VAR coefficient matrices in the Frobenius norm.
9. Please note that in the case of restricted estimation, i.e., zero restrictions or cross-equation restrictions, OLS is not asymptotically equivalent to PML in general.
10. A similar property holds for being a “thin” subset of . This implies that the probability that the OLS estimator calculated over corresponds to an element is equal to zero in general.
11. Below Example 3 we clarify how these indices are related to the state space unit root structure defined in Bauer and Wagner (2012, Definition 2) and link these to the dimensions of the cointegrating spaces in Section 5.2.
12. Uniqueness of realizations in the VAR case stems from the normalization , which reduces the class of observationally equivalent VAR realizations of the same transfer function , with , to a singleton.
13. The pair is left coprime if all its left divisors are unimodular matrices. Unimodular matrices are polynomial matrices with constant non-zero determinant. Thus, pre-multiplication of, e.g., with a unimodular matrix does not affect the determinantal roots that shape the dynamic behavior of the solutions of VAR models.
14. When using the echelon canonical form, the partitioning is according to the so-called Kronecker indices, related to a basis selection for the row space of the Hankel matrix corresponding to the transfer function ; see, e.g., Hannan and Deistler (1988, chp. 2.4) for a precise definition.
15. Here and below we only consider state space systems in so-called innovation representation, with the same error in both the output equation and the state equation. Since every state space system has an innovation representation, this is no restriction, compare Aoki (1990, chp. 7.1).
16. The definition of cointegrating spaces as linear subspaces allows one to characterize them by a basis and implies a well-defined dimension. These advantages, however, have the implication that the zero vector is an element of all cointegrating spaces, despite not being a cointegrating vector in our definition, where the zero vector is excluded. This issue is, of course, well known in the cointegration literature.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).