Cointegration, Root Functions and Minimal Bases

This paper discusses the notion of cointegrating space for linear processes integrated of any order. It first shows that the notions of (polynomial) cointegrating vectors and of root functions coincide. Second, it discusses how the cointegrating space can be defined (i) as a vector space of polynomial vectors over complex scalars, (ii) as a free module of polynomial vectors over scalar polynomials, or finally (iii) as a vector space of rational vectors over rational scalars. Third, it shows that a canonical set of root functions can be used as a basis of the various notions of cointegrating space. Fourth, it reviews results on how to reduce polynomial bases to minimal order, i.e., minimal bases. The application of these results to Vector AutoRegressive processes integrated of order 2 is found to imply the separation of polynomial cointegrating vectors from non-polynomial ones.


Introduction
In their seminal paper, Engle and Granger (1987) introduced the notion of cointegration and of cointegrating (CI) rank for processes integrated of order 1, or I(1). They did this in the following way: DEFINITION: The components of the vector x_t are said to be co-integrated of order d, b, denoted x_t ∼ CI(d, b), if (i) all components of x_t are I(d); (ii) there exists a vector β (≠ 0) so that z_t = β′x_t ∼ I(d − b), b > 0. The vector β is called the co-integrating vector.
[...] If x t has p components, then there may be more than one co-integrating vector β. It is clearly possible for several equilibrium relations to govern the joint behavior of the variables. In what follows, it will be assumed that there are exactly r linearly independent co-integrating vectors, with r ≤ p − 1, which are gathered together into the p × r array β. By construction the rank of β will be r which will be called the "co-integrating rank" of x t . Engle and Granger (1987) did not define explicitly the notion of cointegrating space, but just the cointegrating rank, which corresponds to its dimension; explicit mention of the cointegrating space was first made in Johansen (1988).
The Granger representation theorem in Engle and Granger (1987) showed that the cointegration matrix β needs to be orthogonal to the Moving Average (MA) impact matrix of ∆x_t. More precisely, for ∆x_t = C(L)ε_t, the MA impact matrix C(1) has rank equal to p − r and representation C(1) = β_⊥ a′, where β_⊥ is a basis of the orthogonal complement of the space spanned by the columns of β and a is of full column rank. Johansen (1991, 1992) stated the appropriate conditions under which the Granger representation theorem holds for I(1) and I(2) Vector AutoRegressive (VAR) processes A(L)x_t = ε_t, where the AR impact matrix A(1) has rank equal to r < p and rank factorization A(1) = −αβ′, with α and β of full column rank. He defined the cointegrating space as the vector space generated by the column vectors β_j of β over the field of real numbers R.
The present paper considers alternative definitions of the cointegrating space, corresponding to different choices of the set of vectors and scalars. The sets of vectors are chosen here to be either the set of polynomial vectors or the one of rational vectors, while the sets of scalars are taken to be (i) the field F = R, C, (ii) the ring of polynomials with coefficients in F (denoted F[z]), or (iii) the field of rational functions with coefficients in F (denoted F(z)). The resulting spaces are either vector spaces, in cases (i) and (iii), or a free module, in case (ii). The relationship among their bases is discussed following Forney (1975), whose results are used to derive a polynomial basis of minimal degree, i.e., a minimal basis.
The focus of this paper is on the parsimonious representation of the set of cointegrating vectors. As noted by a referee, the present results may also find application in the parametrization and estimation of I(d) EC systems. This, however, is beyond the scope of the present paper.
The rest of the paper is organised as follows. Section 2 provides the motivation for the paper. Section 3 reports definitions of integration and cointegration in I(d) systems, where the cointegrating vectors ζ(z) = ∑_{j=0}^∞ (z − z_ω)^j ζ_j are allowed to be vector functions; here (z − z_ω) and its powers are associated with the difference operator and its powers. Section 4 defines root functions and canonical systems of root functions, and Section 5 discusses possible definitions of the cointegration space. Section 6 discusses how to derive bases for the various notions of cointegrating space from VAR coefficients. Section 7 discusses minimal bases using results in Forney (1975), and Section 8 applies these results in order to obtain a minimal basis in the I(2) VAR case. Section 9 concludes; Appendix A reports background results.

Motivation
This section motivates the study of the representation of cointegrating vectors in terms of bases of suitable spaces, for systems integrated of order two, which are more formally introduced in Section 3 below. Let x_t be a p × 1 vector process, and let ∆ = 1 − L and L be the (0-frequency) difference and lag operators. Assume that x_t is integrated of order 2, I(2), with ∆^j x_t nonstationary for j < 2 and stationary for j ≥ 2. Mosconi and Paruolo (2017) consider the identification problem for a cointegrating SSE with I(2) variables, ecm_t = ξ(∆)′x_t. The first set of r_0 polynomial vectors in ξ(∆)′ has coefficient β′ of order 0 (i.e., the one that multiplies ∆^0) and coefficient υ′ of order 1 (i.e., the one that multiplies ∆^1). The last r_0 + r_1 polynomial vectors have 0 coefficients of order 0 and γ′ and β′ coefficients of order 1. They discussed identification of the SSE with respect to transformations corresponding to pre-multiplication of ξ(∆)′ (or ecm_t) by a block triangular, nonsingular matrix Q, whose blocks Q_ab have real coefficients, a, b ∈ {0, γ, β}, with Q_00 and Q_γγ nonsingular square matrices. They show that Qξ(∆)′ = ξ°(∆)′ has the same structure as ξ(∆)′ in terms of the null coefficients of order 0 in the last r_1 + r_0 equations, as well as the same β′ block as the coefficient of order 0 in the first r_0 rows and as the coefficient of order 1 in the last r_0 rows. More precisely:
• β′ is replaced by β°′ = Q_00 β′, a set of r_0 linear combinations of β′,
• γ′∆ is replaced by a set of r_1 linear combinations of γ′∆ and β′∆,
• υ′∆ is replaced by a set of r_0 linear combinations of υ′∆, γ′∆ and β′∆.
Remark 1 (F-linear combinations). Note that the Q linear combinations have scalars taken from F = R, and that any CI vector can be obtained as a linear combination with coefficients in F of the rows in ξ(∆)′, called in the following 'F-linear combinations'.
The main motivation to study the notion of cointegration space for I(d) processes with d ≥ 2 comes from the following observation.

Remark 2 (F[∆]-linear combinations). Consider linear combinations of the rows of ζ(∆)′ with scalar weights a(∆), where a(z) ∈ F[z] has the form a(z) = ∑_{j=0}^n a_j z^j for some finite n. To show that the set of F[∆]-linear combinations of ζ(∆)′ is the same as the set of F-linear combinations of ξ(∆)′, it is sufficient to show that the rows of ξ(∆)′ can be obtained as F[∆]-linear combinations of the rows in ζ(∆)′, possibly up to terms of the type c′∆^2 which generate stationary processes by definition.
Note first that β′ + υ′∆ is common to ξ(∆)′ and ζ(∆)′. In order to obtain γ′∆ in ξ(∆)′ from ζ(∆)′, one needs to select the scalar ∆ from F[∆] and multiply it by γ′. Similarly, in order to obtain β′∆ in ξ(∆)′, one only needs to select the scalar ∆ ∈ F[∆] and multiply it by β′ + υ′∆ to obtain β′∆ + υ′∆^2. Because ∆^2 x_t is stationary by the assumption that x_t is I(2), the term υ′∆^2 can be discarded, and this completes the argument.
The take-away from Remark 2 is that, if one allows the set of multiplicative scalars to contain polynomials, i.e., if one moves from F-linear combinations to F[∆]-linear combinations, then one can reduce the number of rows needed to generate the set of CI vectors: ξ(∆)′ in fact has 2r_0 + r_1 rows, while the number of rows in ζ(∆)′ is r_0 + r_1.
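The row-count reduction can be illustrated with a small numerical sketch. The vectors beta, ups, gam below are arbitrary illustrative choices (p = 3, r_0 = r_1 = 1), not taken from any model in the paper; row polynomials in ∆ are stored as lists of coefficient vectors.

```python
import numpy as np

# Hypothetical p = 3 system with r0 = r1 = 1; beta, ups, gam are
# illustrative coefficient vectors, not from any estimated model.
beta = np.array([1.0, -1.0, 0.0])
ups  = np.array([0.5,  0.0, 0.0])
gam  = np.array([0.0,  1.0, -1.0])

# A row-vector polynomial in Delta is stored as a list of coefficient
# vectors: coeffs[j] multiplies Delta**j.
zeta = [[beta, ups],   # beta' + ups' * Delta
        [gam]]         # gam'

def scale(a, row):
    """Multiply a row polynomial by the scalar polynomial a(Delta)."""
    out = [np.zeros_like(row[0]) for _ in range(len(a) + len(row) - 1)]
    for i, ai in enumerate(a):
        for j, rj in enumerate(row):
            out[i + j] = out[i + j] + ai * rj
    return out

def mod_delta2(row):
    """Drop terms in Delta**2 and higher: they map an I(2) x_t to stationarity."""
    return row[:2] + [np.zeros_like(row[0])] * max(0, 2 - len(row))

# Delta * gam' gives the row gam' * Delta of xi(Delta)'
assert all(np.allclose(a, b) for a, b in
           zip(mod_delta2(scale([0.0, 1.0], zeta[1])), [np.zeros(3), gam]))
# Delta * (beta' + ups'*Delta) = beta'*Delta + ups'*Delta**2
#                              = beta'*Delta  (mod Delta**2)
assert all(np.allclose(a, b) for a, b in
           zip(mod_delta2(scale([0.0, 1.0], zeta[0])), [np.zeros(3), beta]))
```

The two assertions reproduce the argument above: the three row types of ξ(∆)′ are recovered from the two rows of ζ(∆)′ using polynomial scalars, up to terms in ∆^2.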
The previous discussion shows that the two sets, F and F[∆], can both be used as sets of scalars when taking linear combinations. The first one, F, is a field (i.e., a division ring); the second one, F[∆], is a ring but not a field, because its elements generally lack multiplicative inverses.
Given that vector spaces require the set of scalars to be a field, one may also consider another possible set of scalars, namely F(∆), the set of rational functions of the type a(∆) = c(∆)/d(∆) with c(∆), d(∆) ∈ F[∆] and d(∆) not identically equal to 0, indicated as d(∆) ≢ 0. This leads one to consider three possible choices for the set of scalars: (i) the field F, (ii) the ring F[∆] and (iii) the field F(∆). The rest of the paper discusses the relative merits of using each of them.
The above discussion focused on unit roots at z = 1, which are associated with the long run behavior of the process. When data are observed every month or quarter, seasonal unit roots, seasonal cointegration and seasonal error correction have been shown to be useful notions, see Hylleberg et al. (1990). For instance, in the case of quarterly series, the relevant seasonal unit roots are at z = −1 and z = ±i, where i is the imaginary unit. These roots are represented as z_ω = exp(iω) with 0 ≤ ω < 2π, where z_ω = 1, i, −1, −i correspond to ω = 0, π/2, π, 3π/2. Johansen and Schaumburg (1998) showed that the conditions under which a VAR process allows for seasonal integration (and cointegration) of order 1 are of the same type as for roots at z = 1, except that expansions of the VAR polynomial are performed around each z_ω, see their Theorem 3. They also provided the corresponding EC form in their Corollary 2; see also Bauer and Wagner (2012) and the discussion in Remark 9 below.
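For quarterly data, the four points z_ω above are exactly the roots of z^4 − 1 = 0, the polynomial annihilated by the seasonal difference 1 − L^4; a short numerical check:

```python
import numpy as np

# Quarterly data: the seasonal difference 1 - L^4 annihilates unit roots
# at all four quarterly frequencies. The roots of z^4 - 1 = 0 are the
# points z_w = exp(i*w) for w = 0, pi/2, pi, 3*pi/2.
roots = np.roots([1, 0, 0, 0, -1])          # coefficients of z^4 - 1
expected = np.exp(1j * np.array([0, np.pi / 2, np.pi, 3 * np.pi / 2]))

# Each expected root z_w appears among the computed roots.
for z in expected:
    assert np.min(np.abs(roots - z)) < 1e-8
```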
In general, the conditions for integration of any order d at a point z_ω on the unit circle can be shown to be of the same type. This paper hence considers the generic case of a linear process with a generic root on the unit circle z_ω = exp(iω), and discusses the notions of cointegration, root functions and minimal bases in this general context. This allows one to show that the present results hold for a generic frequency ω, 0 ≤ ω < 2π.
Incidentally, the results presented below in Section 6 state the generalization of the Granger and Johansen Representation Theorems presented in Franchi and Paruolo (2019) to a generic unit root z_ω = exp(iω) at any frequency ω.

Setup and Definitions
This section introduces notation and basic definitions of integrated and cointegrated processes.

Linear Processes
Assume that {ε_t, t ∈ Z} is a p × 1 i.i.d. sequence, called a noise process, with E(ε_t) = 0 and E(ε_t ε_s′) = Ω 1_{s=t}, where 1_{·} is the indicator function, and define the linear process u_t = µ_t + C(L)ε_t, where µ_t is a nonstochastic p × 1 vector and C(z) = ∑_{j=0}^∞ z^j C_j° is a p × p matrix function, with coefficient matrices C_j° ∈ R^{p×p}. Note that the matrices C_j° are defined by an expansion of C(z) around z = 0. The term µ_t is nonstochastic, i.e., E(µ_t) = µ_t, and can contain deterministic terms. Because E(ε_t) = 0, one sees that E(u_t) = µ_t, and hence in the following u_t is often written as u_t = E(u_t) + C(L)ε_t.
The matrix function C(z) = ∑_{j=0}^∞ z^j C_j° is assumed to be finite for z inside the open disk D(0, 1 + η), η > 0, in C with center 0 and radius 1 + η > 1, i.e., C(z) is assumed analytic on D(0, 1 + η). Here and in the following, | · | indicates the modulus and D(c, ρ) indicates the open disk D(c, ρ) := {z ∈ C : |z − c| < ρ} with center c ∈ C and radius ρ > 0. In this paper C(z) is assumed to be regular on D(0, 1 + η), i.e., C(z) can lose rank only at a finite number of isolated points in D(0, 1 + η).
Because of analyticity of C(z), it can be expanded around any interior point of D(0, 1 + η). In particular, define the point z_ω := e^{iω} on the unit circle at frequency ω, ω ∈ [0, 2π), and observe that it lies inside D(0, 1 + η) because η > 0. Hence one can expand C(z) as C(z) = ∑_{j=0}^∞ (z − z_ω)^j C_j. Note that the matrices C_j are defined by an expansion of C(z) around z = z_ω, but the dependence of C_j on ω is not included in the notation for simplicity. The analysis of the properties of C(z) is done locally around z = z_ω on D(z_ω, η), η > 0.
Similarly to C(z), one can consider a scalar function of z, a(z) say, or a 1 × p vector function b(z)′, taken to be analytic on D(z_ω, η), η > 0. This means that a(z) has representation a(z) = ∑_{j=0}^∞ (z − z_ω)^j a_j around z_ω, and similarly for b(z)′. A special case is when a(z) is a polynomial of degree k, a(z) = ∑_{j=0}^k (z − z_ω)^j a_j, which corresponds to setting a_j = 0 for all j > k. Another special case is given by rational functions a(z) = c(z)/d(z) with c(z) and d(z) polynomials, where d(z) ≢ 0 and z_ω is not a root of d(z). Similarly for b(z)′.
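The re-expansion of an analytic C(z) around z = z_ω can be checked numerically on a small matrix polynomial; the 2 × 2 coefficients below are randomly generated for illustration, and the shifted coefficients are computed via the binomial theorem (which, for a polynomial, coincides with C^{(j)}(z_ω)/j!).

```python
import numpy as np
from math import comb

# Illustration: re-expand a matrix polynomial C(z) = C0 + C1*z + C2*z^2
# around z_w = exp(i*w); the shifted coefficients are complex whenever
# z_w is not real (cf. Remark 11).
rng = np.random.default_rng(0)
C = [rng.standard_normal((2, 2)) for _ in range(3)]   # C0, C1, C2 (real)
z_w = np.exp(1j * np.pi / 2)                          # z_w = i, frequency pi/2

# Coefficients around z_w via the binomial theorem:
# sum_k C_k z^k = sum_k C_k (z_w + (z - z_w))^k
Cshift = [sum(comb(k, j) * C[k] * z_w ** (k - j)
              for k in range(j, 3)) for j in range(3)]

# Both expansions evaluate to the same matrix at an arbitrary point z.
z = 0.3 + 0.4j
lhs = sum(C[k] * z ** k for k in range(3))
rhs = sum(Cshift[j] * (z - z_w) ** j for j in range(3))
assert np.allclose(lhs, rhs)
```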

Integration
The following definition specifies the I_ω(0) class of processes as a subset of all linear processes built from the noise sequence ε_t, and introduces the notion of I_ω(d) processes using the difference operator at frequency ω, ∆_ω := 1 − e^{−iω}L = 1 − z_ω^{−1}L. To simplify notation, the dependence of ∆_ω on the lag operator L is left implicit. Observe also that, because z_ω = e^{iω} ≠ 0, z − z_ω in the analytic expansions can be expressed as z − z_ω = −z_ω(1 − z_ω^{−1}z). Next, the definition of the order of integration is introduced; this is defined as the difference between two nonnegative integer exponents d_1 and d_2 of ∆_ω in the representation that links the process x_t with its driving linear process u_t. This definition allows for the possibility that x_t is integrated of negative order.
Definition 1 (Integrated processes at frequency ω). Let C(z) be analytic on D(0, 1 + η), η > 0, and let ε_t be a noise process. If {u_t, t ∈ Z} satisfies u_t = E(u_t) + C(L)ε_t, then u_t is called a linear process; if, in addition, C(z_ω) ≠ 0, then u_t is said to be integrated of order zero at frequency ω, indicated u_t ∼ I_ω(0).
If ∆_ω^{d_1} x_t = ∆_ω^{d_2} u_t with u_t ∼ I_ω(0), (2), then x_t is said to be integrated of order d := d_1 − d_2 at frequency ω, indicated x_t ∼ I_ω(d); in this case x_t has representation ∆_ω^{d_1} x_t = ∆_ω^{d_2}(E(u_t) + C(L)ε_t), where C(z) is analytic on D(0, 1 + η), η > 0, and C(z_ω) ≠ 0.
Remark 4 (Mean-0 linear process). The linear process u_t in Definition 1 can have any expectation E(u_t), which, however, does not play any role in the definition of the x_t process. Hence one can assume E(u_t) = 0 in Definition 1 without loss of generality.
Leading cases are the ones where either d_1 or d_2 equals 0. Specifically, when 0 = d_1 < d_2, d = d_1 − d_2 = −d_2 is negative, and (2) reads x_t = ∆_ω^{d_2} u_t.

Remark 7 (Example of I_0(−1)). As an example, consider the process x_t = C(L)ε_t with C(L) = 1 − L. Setting ω = 0, one finds that Equation (2) is satisfied with d = −1, i.e., the process is I_0(−1). Selecting any other frequency 0 < ω < 2π, one sees that Equation (2) is satisfied for d = 0, i.e., the order of integration is 0, I_ω(0) for 0 < ω < 2π. This illustrates the fact that a process may have different orders of integration at different frequencies.
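The I_0(−1) example of Remark 7 can be verified directly: cumulating x_t = (1 − L)ε_t at frequency 0 (a running sum) telescopes back to ε_t − ε_0, a noise process, which is exactly the sense in which one order of integration is in excess on the MA side.

```python
import numpy as np

# x_t = (1 - L) eps_t is I_0(-1): the running sum of x_t recovers
# eps_t - eps_0, i.e., cumulation at frequency 0 yields noise.
rng = np.random.default_rng(42)
eps = rng.standard_normal(200)
x = eps[1:] - eps[:-1]               # x_t = eps_t - eps_{t-1}, t = 1..199
assert np.allclose(np.cumsum(x), eps[1:] - eps[0])
```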
Remark 8 (t ∈ Z versus t ∈ N_0). Consider the process x_t = c + ∑_{j=1}^t ε_j defined only for t ∈ N_0 = N ∪ {0}, which satisfies ∆_0(x_t − c) = ε_t for t ∈ N. Consider another process {x*_t, t ∈ Z} satisfying the same equation ∆_0(x*_t − c) = ε_t for t ∈ Z with x*_t = x_t for t ∈ N_0. The process {x*_t, t ∈ Z} is I_0(1) according to Definition 1, and it is suggested to extend this qualification to x_t, because it coincides with the x*_t process on the non-negative integers, x*_t = x_t for t ∈ N_0.

Remark 9 (One or more frequencies). Definition 1 of integration refers to a single frequency ω, but it can be used to cover multiple frequencies. In fact, consider the 'ARMA process with unit root structure', as defined in Bauer and Wagner (2012), i.e., a process x_t satisfying D(L)x_t = v_t, where D(L) := ∏_{j=1}^n ∆_{ω_j}^{m_j} for a (finite) set of frequencies ω_1, . . . , ω_n, with v_t a stationary ARMA process v_t = C(L)ε_t with C(exp(iω_j)) ≠ 0. They call {(ω_j, m_j), j = 1, . . . , n} the 'unit root structure' of x_t, see their Definition 2. This can be obtained using Definition 1 for each ω_j in turn, noting that v_t being ARMA corresponds to a rational C(z), which is a special case of the definition above. Hylleberg et al. (1990), Gregoir (1999), Johansen and Schaumburg (1998) and Bauer and Wagner (2012) consider x_t to be real-valued, which implies that integration frequencies ±ω_j are 'paired', so that if exp(iω_j) is a unit root of the process, so is exp(−iω_j); in this case one can pair frequencies ±ω_j with 0 < ω_j < π and rearrange coefficients so as to obtain real coefficient matrices in EC representations. This is not done in this paper for reasons of simplicity.
Remark 10 (Relation with other definitions). The definition of an I_ω(0) (respectively an I_ω(d)) process in the present Definition 1 coincides with Definition 3.2 (respectively Definition 3.3) in Johansen (1996) when setting ω = 0 (respectively ω = 0 and d_2 = 0). The present definition also agrees with Definitions 2.1 and 2.2 of integration in Gregoir (1999), both for positive and negative orders and any frequency ω. The definition also agrees with the one in Franchi and Paruolo (2019) when applied to vector processes.
Remark 11 (Entries in C(z)). When ω differs from 0 or π, the point z_ω = e^{iω} has a nonzero imaginary part; hence the matrix C(z_ω) in (1) has complex entries and the coefficient matrices C_j in the expansion C(z) = ∑_{j=0}^∞ (z − z_ω)^j C_j are complex even when the coefficients in the expansion around z = 0 are real.
Following Gregoir (1999), the summation operator at frequency ω is defined as S_ω u_t := ∑_{j=1}^t z_ω^{j−t} u_j, t ≥ 1. Basic properties of the operator are proved in Gregoir (1999); these include ∆_ω S_ω u_t = u_t and S_ω ∆_ω u_t = u_t − z_ω^{−t} u_0, where {u_t, t ∈ Z} is any sequence over Z.
Remark 12 (Simplifications of ∆_ω and initial values). Take d_1 = d_2 = 1 in (2), which in this case reads ∆_ω x_t = ∆_ω u_t with u_t ∼ I_ω(0). Applying the S_ω operator on both sides one obtains x_t − z_ω^{−t} x_0 = u_t − z_ω^{−t} u_0. If one assigns the initial value x_0 equal to u_0, one obtains x_t = u_t, which corresponds to the cancellation of ∆_ω from both sides of (2). The same reasoning applies for generic d_1, d_2 > 0 to the cancellation of ∆_ω^{min(d_1,d_2)} from both sides of (2). This shows that one can simplify powers of ∆_ω from both sides of (2) by properly assigning initial values; this cancellation is always implicitly performed in the following, in line with the preference for minimal values of d_1, d_2 discussed in Remark 6.
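The cancellation argument rests on a telescoping property of the summation operator; a numerical sketch, assuming S_ω takes the running-sum form S_ω u_t = ∑_{j=1}^t z_ω^{j−t} u_j (any other convention differing by initial values would telescope analogously):

```python
import numpy as np

# Numerical check of S_w(Delta_w u)_t = u_t - z_w^{-t} u_0, assuming
# S_w u_t = sum_{j=1}^t z_w^{j-t} u_j (frequency-w running sum).
w = np.pi / 2
z_w = np.exp(1j * w)
rng = np.random.default_rng(1)
u = rng.standard_normal(50) + 1j * rng.standard_normal(50)   # u_0, ..., u_49

d_u = u[1:] - u[:-1] / z_w        # (Delta_w u)_t = u_t - z_w^{-1} u_{t-1}, t >= 1

def S_w(v):
    # v[j-1] holds v_j for j = 1..len(v); returns the array of S_w v_t over t
    return np.array([np.sum(z_w ** (np.arange(1, s + 1) - s) * v[:s])
                     for s in range(1, len(v) + 1)])

lhs = S_w(d_u)                    # S_w(Delta_w u)_t for t = 1..49
rhs = u[1:] - z_w ** (-np.arange(1, 50)) * u[0]
assert np.allclose(lhs, rhs)
```

Assigning x_0 = u_0 then makes the right-hand side equal to u_t, which is the cancellation used in Remark 12.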

Cointegration
Cointegration is the property of (possibly polynomial) linear combinations of x_t having a lower order of integration than the original order of integration of x_t at frequency ω. Specifically, consider a nonzero 1 × p row vector function ζ(L)′. Following Engle and Granger (1987), the idea is to call ζ(L)′ cointegrating if ζ(L)′x_t has lower order of integration than x_t, excluding cases such as ζ(L)′ = ∆_ω a′, where a′ by itself does not reduce the order of integration.
This leads to the following definition.
Definition 2 (Cointegrating vector at frequency ω). Let x_t ∼ I_ω(d) be as in Definition 1, i.e., ∆_ω^{d_1} x_t = ∆_ω^{d_2} C(L)ε_t. A 1 × p vector function ζ(z)′, analytic on D(z_ω, η), η > 0, with ζ(z_ω)′ ≠ 0′, is called a cointegrating vector at frequency ω if, for some s ∈ N, ∆_ω^{d_1} ζ(L)′x_t = ∆_ω^{d_2+s} g(L)ε_t, (7) where g(z)′ is analytic on D(z_ω, η), η > 0, and g(z_ω)′ ≠ 0′. Given Equation (2), Equation (7) is equivalent to the condition ζ(z)′C(z) = (1 − z_ω^{−1}z)^s g(z)′. (8) The positive integer s ∈ N is called the order of the cointegrating vector ζ(z)′ of C(z) at z_ω. x_t is said to be cointegrated at frequency ω if any cointegrating vector ζ(z)′ = ∑_{j=0}^∞ (z − z_ω)^j ζ_j′ can be replaced by ζ(z_ω)′ = ζ_0′ without decreasing the order s in (8); otherwise x_t is said to be multicointegrated at frequency ω.

Remark 14 (Entries in cointegrating vectors). Similarly to Remark 11, the coefficient vectors ζ_j′ in the expansion ζ(z)′ = ∑_{j=0}^∞ (z − z_ω)^j ζ_j′ are in general complex when ω differs from 0 or π.
Remark 15 (d and s). Recall that d (the order of integration) is the difference between the exponents of ∆_ω on the l.h.s. and r.h.s. of (2). When pre-multiplying by ζ(L)′, the exponent on the r.h.s. increases by s, and the difference of the exponents on the l.h.s. and r.h.s. of (7) becomes d − s. The condition g(z_ω)′ ≠ 0′ guarantees that no remaining additional power of ∆_ω can be factored from C(L) using ζ(L)′.

Remark 16 (Examples of cointegration vectors).
Take ζ(L)′ = ζ_0′ with ζ_0 chosen in (col C(1))_⊥, and note that this implies s ≥ 1 in (7). This shows that the definition contains the I_0(1) definition of cointegrating vectors as a special case.
The usual definition of cointegration, see Definition 3.4 in Johansen (1996), considers a p × 1 process x_t ∼ I_0(1) and defines x_t as cointegrated with cointegrating vector ζ ≠ 0 if ζ′x_t "can be made stationary by a suitable choice of initial distribution". The following proposition clarifies that his definition coincides with the one in this paper.

Proposition 1 (Relation with Definition 3.4 in Johansen (1996)). ζ is a cointegrating vector in the sense of Definition 3.4 in Johansen (1996) if and only if Definition 2 is satisfied with ω = 0, ζ(z)′ = ζ′, d = 1 and some s ∈ N.
Proof. For simplicity and without loss of generality, set E(x_t) = 0 and omit the subscript ω = 0. Assume Definition 2 is satisfied with ω = 0, ζ(z)′ = ζ′, d = 1 and s ∈ N, i.e., ζ′∆x_t = ∆^s g(L)ε_t, see Remark 12, and set v_t := ∆^{s−1} g(L)ε_t. Applying S to both sides of this equation one obtains ζ′x_t − ζ′x_0 = v_t − v_0. Note that v_t is stationary for any s ∈ N, and hence the initial value ζ′x_0 can be chosen equal to v_0, so as to obtain ζ′x_t = v_t, a stationary process. Conversely, assume that ζ is a cointegrating vector in the sense of Definition 3.4 in Johansen (1996). Because x_t ∼ I(1), one has ∆x_t = C(L)ε_t, see Definition 1, with C(z) analytic on a disk D(0, 1 + η), η > 0, which admits the expansion C(z) = C + C*(z)(1 − z) around 1, where C := C(1) and C*(z) is analytic on the same disk. A necessary and sufficient condition for cointegration in the sense of Definition 3.4 in Johansen (1996) is that ζ′C = 0, as shown in Johansen (1988) Equation (17); see also Engle and Granger (1987, p. 256). Hence one finds ζ′∆x_t = ∆g(L)ε_t with g(z)′ := ζ′C*(z), which is analytic on D(0, 1 + η), η > 0, and hence also on D(1, η), η > 0. By Corollary 1 below, g(z)′ satisfies g(z)′ = ∆^m g*(z)′ with finite m ∈ N_0 and g*(1)′ ≠ 0′. This shows that Definition 2 is satisfied with ζ(z)′ = ζ′, d = 1, and s = m + 1 ∈ N.
Remark 17 (ζ′x_t can have negative order of integration). Johansen (1996) makes the following observation just after his Definition 3.4: "Note that ζ′x_t need not be I(0)", which recognises that ζ′x_t can have negative order of integration. This is indeed the case for s = 2, 3, . . . in Definition 2, because then ζ′x_t ∼ I(1 − s).
Remark 18 (Relation to other definitions in the literature). The definition of cointegration in Engle and Granger (1987) reported in the introduction is a special case of the present one with ζ(z)′ = ζ_0′ a constant vector and ω = 0, under the additional requirement that all variables are integrated of the same order. For more details on this for the case ω = 0, see Franchi and Paruolo (2019). When s > 1 and ω = 0, Definition 2 covers the definitions of multicointegration and polynomial cointegration in Granger and Lee (1989), Engle and Yoo (1991) and Johansen (1996). When s = 1 and ω = 2πj/n for j = 1, . . . , n, where n is the number of seasons, the definition covers seasonal cointegration in Hylleberg et al. (1990) and Johansen and Schaumburg (1998).
Here and in the following, let a_⊥ indicate a basis of the orthogonal complement of the linear space spanned by the columns of the matrix a. Moreover, P_a := a(a′a)^{−1}a′ for a full-column-rank matrix a is the orthogonal projection matrix onto col(a).

Example 1 (I(1) VAR). Consider the VAR process A(L)x_t = ε_t. Johansen (1991) showed that a set of necessary and sufficient conditions for x_t to be I(1) at frequency ω = 0 is the one given in his Equations (4.3) and (4.4) in Theorem 4.1. In this case x_t satisfies (2) for d_1 = 1, d_2 = 0, and ζ(L)′ = ζ′ taken to be any row vector in B = row_F(β_0′) with F = R.
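The I(1) conditions just discussed can be sketched numerically for a bivariate VAR(1) x_t = A_1 x_{t−1} + ε_t, written in EC form ∆x_t = Πx_{t−1} + ε_t with Π = A_1 − I = αβ′; in this VAR(1) special case Johansen's condition reduces to the nonsingularity of α_⊥′β_⊥. The α and β below are illustrative choices, not estimated from data.

```python
import numpy as np

# Bivariate VAR(1): x_t = A1 x_{t-1} + eps_t, EC form with Pi = alpha beta'.
# For a VAR(1), the I(1) condition reduces to alpha_perp' beta_perp nonsingular.
alpha = np.array([[-0.5], [0.25]])
beta  = np.array([[1.0], [-1.0]])
Pi = alpha @ beta.T
A1 = np.eye(2) + Pi

def perp(a):
    """Basis of the orthogonal complement of col(a), via full SVD."""
    u, s, _ = np.linalg.svd(a, full_matrices=True)
    return u[:, a.shape[1]:]

r = np.linalg.matrix_rank(Pi)
assert r == 1                                  # reduced rank: cointegration
cond = perp(alpha).T @ perp(beta)
assert np.abs(np.linalg.det(cond)) > 1e-10     # I(1) condition holds
```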
Example 2 (I(2) VAR). Following Johansen (1992), consider the same VAR process as in Example 1. Johansen (1992) stated a set of necessary and sufficient conditions for x_t to be I(2) at frequency ω = 0. In this case x_t satisfies (2) for d_1 = 2, d_2 = 0, and ζ(L)′ = ζ_0′ + ∆ζ_1′ taken to be any row vector obtained by linear combinations of the rows in β_0′ + (1 − L)ᾱ_0′A_1 and β_1′. The notion of cointegrating space for I(2) processes is discussed in detail below, where ᾱ_0′A_1 is called the 'multicointegrating coefficient'.

Root Functions, Cointegrating Vectors and Canonical Systems
This section introduces root functions and canonical systems of root functions, and their connection to cointegrating vectors, as defined in Definition 2 above.

Root Functions
Let x_t ∼ I_ω(d) be cointegrated at frequency ω, i.e., ∆_ω^{d_1} x_t = ∆_ω^{d_2} C(L)ε_t, see Definition 2, where d := d_1 − d_2 and C(z) has full rank on D(z_ω, η), η > 0, except at z = z_ω, see Remark 13. The following definition of (left) root functions is taken from Gohberg et al. (1993); this definition is given in a neighborhood of z_ω.

Definition 3 (Root function). A 1 × p vector function ϕ(z)′, analytic on D(z_ω, η), η > 0, with ϕ(z_ω)′ ≠ 0′, is called a (left) root function of C(z) at z_ω if ϕ(z)′C(z) = (z − z_ω)^s ψ(z)′ for some s ∈ N, with ψ(z)′ analytic at z_ω; the largest such positive integer s is called the order of the root function ϕ(z)′ at z_ω.
Remark 19 (Factoring the difference operator). Definition 3 characterizes root functions by their ability to factor powers of (z − z_ω) from C(z). Note that, because z_ω ≠ 0, one has z − z_ω = −z_ω(1 − z_ω^{−1}z), so that factoring powers of (z − z_ω) corresponds to factoring powers of ∆_ω.

Remark 20 (Local analysis). Note first that C(z) cannot be identically 0 in Definition 3, because C(z) is assumed to be regular. Next take for example the 2 × 2 matrix C(z) = diag(1 − z, 1 + z), which has full rank on C except at the two points z_0 = 1 and z_π = −1, where it has rank 1. Take first the point z_0 = 1; in this case one could choose a disk D(1, η) with any η < 2, on which C(z) is analytic and of full rank except at z_0 = 1. One can verify that a root function at z_0 = 1 is ϕ(z)′ = (1, 0), of order 1. The same can be repeated for the other point z_π = −1, choosing a different disk D(−1, η) with any η < 2, and a root function equal to (0, 1).

The implication of this example is that one can have multiple separated points where C(z) has reduced rank, and apply the above definition to each point separately, using a different disk D for each point. In other words, the discussion of cointegration in this paper is local to a single unit root.
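The root functions of the diagonal example in Remark 20 can be checked with elementary polynomial arithmetic: each entry of ϕ(z)′C(z) must vanish at the unit root, while some derivative must not, pinning down the order.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# The 2x2 example C(z) = diag(1-z, 1+z): phi' = (1, 0) is a root function
# of order 1 at z = 1, since phi' C(z) = ((1-z), 0) = (z-1) * (-1, 0).
one_minus_z = np.array([1.0, -1.0])     # coefficients of 1 - z
one_plus_z  = np.array([1.0,  1.0])     # coefficients of 1 + z

phi = np.array([1.0, 0.0])
row = [phi[0] * one_minus_z, phi[1] * one_plus_z]   # entries of phi' C(z)

# Every entry of phi' C(z) vanishes at z = 1 ...
assert all(abs(P.polyval(1.0, c)) < 1e-12 for c in row)
# ... but not every derivative does, so the order is exactly 1.
assert any(abs(P.polyval(1.0, P.polyder(c))) > 1e-12 for c in row)

# Symmetrically, (0, 1) is a root function of order 1 at z = -1.
row2 = [0.0 * one_minus_z, 1.0 * one_plus_z]
assert all(abs(P.polyval(-1.0, c)) < 1e-12 for c in row2)
```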
Remark 21 (Order). A root function factorises (z − z_ω)^s from C(z), and s indicates the order. The condition ϕ(z_ω)′ ≠ 0′ guarantees that in the analytic expansion ϕ(z)′ = ∑_{n=0}^∞ (z − z_ω)^n ϕ_n′, the first term ϕ_0′ is not the null vector; this makes sure that one cannot extract additional factors of (z − z_ω) from C(z) by trivially rescaling ϕ(z)′ by powers of (z − z_ω).
It is immediate to see that a cointegrating vector is a root function of C(z) and vice versa, as stated in the following theorem.

Theorem 1 (Cointegrating vectors and root functions). ζ(z)′ is a cointegrating vector at frequency ω if and only if ζ(z)′ is a root function of C(z) at z_ω, and the order of the cointegrating vector and of the root function coincide.
Proof. Observe that any root function satisfies Definition 2 of cointegrating vectors and vice versa, including the definition of their order.
Results in Gohberg et al. (1993) show that the order of a root function is finite, because it is bounded by the order of z_ω as a zero of det C(z); this result is reported in the next proposition.
Proposition 2 (Bound on the order of a root function). The order of a root function of C(z) at z ω is at most equal to the order of z ω as a zero of det C(z), which is finite because C(z) is regular.
Corollary 1 (Bound on the order of a cointegrating vector). The order of any cointegrating vector at frequency ω is finite.
Proof. This follows from Proposition 2 because cointegrating vectors and root functions coincide by Theorem 1.
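The bound in Proposition 2 can be illustrated on a variant of the diagonal example, C(z) = diag((1 − z)^2, 1 + z), where det C(z) = (1 − z)^2(1 + z) has a zero of order 2 at z = 1 and a root function attaining that order exists.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# C(z) = diag((1-z)^2, 1+z): det C(z) = (1-z)^2 (1+z) has a zero of
# order 2 at z = 1, which bounds the order of any root function there.
one_minus_z_sq = P.polymul([1.0, -1.0], [1.0, -1.0])
det = P.polymul(one_minus_z_sq, [1.0, 1.0])

def zero_order(c, z0, max_k=10):
    """Order of z0 as a zero of the (nonzero) polynomial with coeffs c."""
    for k in range(max_k):
        if abs(P.polyval(z0, c)) > 1e-12:
            return k
        c = P.polyder(c)
    return max_k

assert zero_order(det, 1.0) == 2
# phi' = (1, 0) gives phi' C(z) = ((1-z)^2, 0): a root function of
# order 2, attaining the bound given by det C(z).
assert zero_order(one_minus_z_sq, 1.0) == 2
```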

Canonical Systems of Root Functions
Next, canonical systems of root functions for C(z) at z_ω are introduced, see Gohberg et al. (1993). Choose a root function φ_1(z)′ of highest order s_1; since the orders of root functions are bounded by Proposition 2, such a function exists. Next proceed iteratively over j = 2, . . . , choosing the next root function φ_j(z)′ to be of the highest order s_j such that φ_j(z_ω)′ is linearly independent of φ_1(z_ω)′, . . . , φ_{j−1}(z_ω)′. Because m := dim((col C(z_ω))_⊥) < ∞, this process ends with m root functions φ_1(z)′, . . . , φ_m(z)′.
Finally, consider the local Smith factorization of C(z) at z = z_ω, see Gohberg et al. (1993), i.e., the factorization C(z) = E(z)M(z)H(z), where M(z) = diag((z − z_ω)^{s_h})_{h=1,...,p} is uniquely defined and contains the partial multiplicities s_1 ≥ · · · ≥ s_p of C(z) at z = z_ω; the matrices E(z), H(z) are analytic and invertible in a neighbourhood of z = z_ω and are non-unique. M(z) is called the local Smith form of C(z) at z = z_ω.

Remark 22 (Extended canonical system of root functions in the I(1) VAR case). In the I(1) VAR case, see Example 1, the orders of an extended canonical system of root functions of C(z) at 1 are (s_1, . . . , s_{r_0}, s_{r_0+1}, . . . , s_p) = (1, . . . , 1, 0, . . . , 0), and a possible choice of an extended canonical system of root functions corresponding to these unique orders is given by the p rows in (β_0, β_1)′.

Cointegrating Spaces
Let φ(z)′ be a canonical system of root functions of C(z) at z_ω, see Definition 4. Appendix A.2 shows that row_G(φ(z)′) with G = F, F[z], F(z) are well defined sets of (generalized) root functions. This section argues that one could take any of them as a definition of 'cointegrating space' for multicointegrated systems. Note that row_F(φ(z)′) ⊆ row_{F[z]}(φ(z)′) ⊆ row_{F(z)}(φ(z)′), so that the three definitions of cointegrating space are naturally nested. Remark that row_F(φ(z)′) is a vector space over F, row_{F[z]}(φ(z)′) is a free module over the ring F[z] of polynomials in z (which contains row_F(φ(z)′)), and row_{F(z)}(φ(z)′) is a vector space over the field F(z) of rational functions of z (which contains row_{F[z]}(φ(z)′) and hence row_F(φ(z)′)). Finally, note the central role played by the canonical system of root functions φ(z)′ as a basis for these different spaces, which differ only in the set of scalars allowed in linear combinations.
The Cointegrating Space row_F(φ(z)′) as a Vector Space over F
The cointegrating space row_F(φ(z)′), where F = R, C, is a vector space. In fact, the set of all F-linear combinations of φ(z)′ produces a vector space, because row_F(φ(z)′) is closed under multiplication by a scalar in F by Proposition A1 and with respect to vector addition, as a special case of Proposition A2.
In order to discuss the cointegrating spaces row F[z] (φ(z) ) and row F(z) (φ(z) ), the notion of generalized cointegrating vector is introduced, as the counterpart of the notion of generalized root function, see Definition A1.
Definition 5 (Generalized cointegrating vector at frequency ω). Let n ∈ Z and let ζ(z)′ be a cointegrating vector at frequency ω of order s, see Definition 2; then (1 − z/z_ω)^{−n} ζ(z)′ is called a generalized cointegrating vector at frequency ω with order s and exponent n.

The Cointegrating Space row F[z] (φ(z) ) as a Free Module over F[z]
Consider next row_{F[z]}(φ(z)′). F[z] is the polynomial ring formed as the set of polynomials in z with coefficients in F. As is well known, F[z] is a ring but not a field (division ring), see e.g., Hungerford (1980), because polynomials, unlike rational functions, generally lack multiplicative inverses. The following proposition summarizes that row_{F[z]}(φ(z)′) is a free module over the ring F[z] of polynomials.

Proposition 3 (row_{F[z]}(φ(z)′) is an F[z]-module). Consider G = row_{F[z]}(φ(z)′), where φ(z)′ is a canonical system of root functions of C(z) at z_ω with coefficients in F, and where F[z] is the ring of polynomials in z with coefficients in F; then G is closed with respect to the vector sum, and it is closed under multiplication by a scalar polynomial in F[z]; hence G is a module over the ring F[z] of polynomials.
Proof. By Propositions A1 and A2, G is closed under addition and under multiplication by a scalar polynomial in F[z]. One needs to verify that, see e.g., Definition IV.1.1 in Hungerford (1980), for ζ(z)′, ψ(z)′ ∈ G and a(z), b(z) ∈ F[z], one has a(z) · (ζ(z)′ + ψ(z)′) = a(z) · ζ(z)′ + a(z) · ψ(z)′, (a(z) + b(z)) · ζ(z)′ = a(z) · ζ(z)′ + b(z) · ζ(z)′, (a(z)b(z)) · ζ(z)′ = a(z) · (b(z) · ζ(z)′) and 1 · ζ(z)′ = ζ(z)′, (13), where · indicates multiplication by a scalar. The properties in (13) are seen to be satisfied. This proves the statement.

The Cointegrating Space row F(z) (φ(z) ) as a Vector Space over F(z)
Finally consider row F(z) (φ(z)′). The set of scalars F(z) is the field of rational functions in z with coefficients in F. As is well known, F(z) is a field (division ring), see e.g., Hungerford (1980).
Remark 24 (Rational vectors without poles at z_ω). Take ζ(z)′ to be a rational vector, i.e., of the form ζ(z)′ = (1/d(z)) b(z)′, where d(z) is a monic polynomial and b(z)′ is a 1 × p vector polynomial, with d(z) and b(z)′ relatively prime, see Example A1. If d(z) has no root equal to z_ω, then ζ(z)′ is an analytic function on D(z_ω, η), η > 0, see Remark A1 and Lemma A1; hence a special case of an analytic vector function ζ(z)′ is a rational vector whose denominator d(z) has no root equal to z_ω.
Remark 25 (Rational vectors with poles at z_ω). If d(z) has a root equal to z_ω with multiplicity m, then ζ(z)′ has a pole of order m, and it is not an analytic function on any D(z_ω, η), η > 0; hence Definition 2 cannot be applied, because it requires ζ(z)′ to be analytic. However, one can remove the pole of order m by defining ξ(z)′ := (1 − z/z_ω)^m ζ(z)′, and use Definition 2 on ξ(z)′, which is an analytic function, as done in Definition 5.
Remark 26 (Representation for generic rational vectors). In the following, when dealing with rational vectors of the type ζ(z)′ = (1/d(z)) b(z)′, it is sufficient to consider the case where d(z) does not have a root at z_ω, thanks to Definition 5. In fact, let d(z) be decomposed as d(z) = (1 − z/z_ω)^m d°(z) with d°(z_ω) ≠ 0 and m ≥ 0; in this representation, z_ω is a root of d(z) if and only if m > 0, and it is not a root if and only if m = 0. By Remark 24, ζ(z)′ is a (generalized) cointegrating vector if and only if ξ(z)′ := (1 − z/z_ω)^m ζ(z)′ = (1/d°(z)) b(z)′ is a cointegrating vector. Hence Definition 5 allows one to concentrate on the case where the denominator has no root at z_ω.
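The pole-removal device of Remarks 25 and 26 can be checked symbolically. The following sketch uses sympy on a hypothetical d(z) and b(z)′ (both illustrative, not taken from the paper), with z_ω = 1, i.e., ω = 0:

```python
import sympy as sp

z = sp.symbols('z')
z_omega = 1  # frequency omega = 0, so z_omega = exp(i*0) = 1

# Hypothetical rational row vector zeta(z)' = (1/d(z)) b(z)' whose
# denominator has a root at z_omega with multiplicity m = 2
d = (1 - z)**2 * (1 - z/2)
b = sp.Matrix([[1 + z, 3*z]])            # 1 x p polynomial numerator, p = 2
zeta = b / d

# Multiplicity m of z_omega as a root of d(z)
m = sp.roots(sp.Poly(d, z)).get(z_omega, 0)

# Remove the pole as in Remark 25: xi(z)' = (1 - z/z_omega)^m zeta(z)'
xi = ((1 - z/z_omega)**m * zeta).applyfunc(sp.cancel)

# The remaining denominator d°(z) has no root at z_omega (Remark 26)
d_rem = sp.cancel(d / (1 - z/z_omega)**m)
```

Here `xi` is analytic at z_ω, so Definition 2 applies to it even though `zeta` itself has a pole there.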
The following proposition summarizes that row F(z) (φ(z) ) is a vector space over the field F(z) of rational functions.
Proposition 4 (row F(z) (φ(z)′) is a vector space over F(z)). Let H = row F(z) (φ(z)′), where φ(z)′ is a canonical system of root functions of C(z) at z_ω with coefficients in F, and where F(z) is the field of rational functions in z with coefficients in F; then H is closed with respect to the vector sum and under multiplication by a scalar rational function in F(z), and H is a vector space over the field F(z) of rational functions.
Proof. H is closed with respect to multiplication by a rational function in F(z), see Proposition A1, and with respect to vector addition, see Proposition A2. One can verify for ζ(z)′, ψ(z)′ ∈ H and 1, a(z), b(z) ∈ F(z) that the distributive equalities in (13) are satisfied. Because F(z) is a field, H is a vector space over F(z).

The Local Rank Factorization
This section shows how to explicitly obtain a canonical system of root functions φ(z)′, or an extended canonical system of root functions (φ(z)′, a⊥), for a generic VAR process with A(z) analytic for all z ∈ D(0, 1 + η), η > 0, having roots at z = z_ω = e^{iω} and at z with |z| > 1, see Remarks 1 and 2. The derivation of the Granger representation theorem involves the inversion of the matrix function in D(z_ω, η). This includes the case of matrix polynomials A(z), in which the degree of A(z) is finite, k say, with A_n = 0 for n > k.⁷ The inversion of A(z) around the singular point z = z_ω yields an inverse with a pole of some order d = 1, 2, . . . at z = z_ω; an explicit condition on the coefficients {A_n}_{n=0}^∞ in (15) for A(z)^{−1} to have a pole of given order d is described in Theorem 2 below; this is indicated as the POLE(d) condition in the following. Under the POLE(d) condition, A(z)^{−1} has Laurent expansion around z = z_ω given by A(z)^{−1} = (z − z_ω)^{−d} C(z), with C(z) := ∑_{n=0}^∞ C_n (z − z_ω)^n. (16) Note that C(z_ω) = C_0 ≠ 0 and that C(z) is expanded around z = z_ω. In the following, the coefficients {C_n}_{n=0}^∞ are called the Laurent coefficients. The first d of them, {C_n}_{n=0}^{d−1}, make up the principal part and characterize the singularity of A(z)^{−1} at z = z_ω.
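In small cases the pole order d and the leading Laurent coefficient can be computed symbolically. A minimal sketch with sympy, on a hypothetical 2 × 2 matrix polynomial singular at z_ω = 1 (the matrix is illustrative, not the paper's A(z)):

```python
import sympy as sp

z = sp.symbols('z')
z_omega = 1

# Hypothetical matrix polynomial, singular at z = z_omega
A = sp.Matrix([[1 - z, 1 - z],
               [0, 1 - z/2]])
Ainv = A.inv().applyfunc(sp.cancel)

def pole_order(M, z, z0, dmax=5):
    """Smallest d >= 0 such that (1 - z/z0)^d * M is finite at z0."""
    for d in range(dmax + 1):
        scaled = ((1 - z/z0)**d * M).applyfunc(sp.cancel)
        if all(sp.limit(e, z, z0).is_finite for e in scaled):
            return d
    return None

d = pole_order(Ainv, z, z_omega)          # here d = 1: a POLE(1) case
# Leading Laurent coefficient C_0 = lim (1 - z/z0)^d A(z)^{-1}, nonzero
C0 = ((1 - z/z_omega)**d * Ainv).applyfunc(lambda e: sp.limit(e, z, z_omega))
```

The computation confirms C_0 ≠ 0, as required for the expansion of the inverse.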

Theorem 2 (POLE(d) condition). Consider A(z) defined in (15), where P_x denotes the orthogonal projection onto the space spanned by the columns of x, and where the matrices A_{h+1,n} are defined recursively as part of the local rank factorization. Finally, let r_j := rank(P_{a_j⊥} A_{j,1} P_{b_j⊥}). Then a necessary and sufficient condition for A(z) to have an inverse with a pole of order d = 1, 2, . . . at z = z_ω (called the POLE(d) condition) is that
r_j < r_j^max (reduced rank condition) for j = 0, . . . , d − 1,
r_d = r_d^max (full rank condition) for j = d.
Observe that because rank(P_{a_j⊥} A_{j,1} P_{b_j⊥}) = rank(a_{j⊥}′ A_{j,1} b_{j⊥}), one has r_j = rank(a_{j⊥}′ A_{j,1} b_{j⊥}); hence d = 1 if and only if r_1 = r_1^max, where r_1 = rank(α_{0⊥}′ A_1 β_{0⊥}) and r_1^max = p − r_0.
This corresponds to the condition in Howlett (1982, Theorem 3) and to the I(1) condition in Johansen (1991, Theorem 4.1). Similarly, one has d = 2 if and only if r_1 < r_1^max and r_2 = r_2^max, which corresponds to the I(2) condition in Johansen (1992, Theorem 3). Theorem 2 is thus a generalization of Johansen's I(1) and I(2) conditions and shows that, in order to have a pole of order d in the inverse, one needs d + 1 rank conditions on A(z): the first d, for j = 0, . . . , d − 1, are reduced rank conditions, r_j < r_j^max, which establish that the order of the pole is greater than j; the last one is a full rank condition, r_d = r_d^max, which establishes that the order of the pole is exactly equal to d. These requirements make up the POLE(d) condition.
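The I(1) case of these rank conditions can be checked numerically from a rank factorization of A(z_ω). A minimal sketch with numpy, where the matrices a0 ("α_0"), b0 ("β_0") and A1 are hypothetical:

```python
import numpy as np

def orth_complement(x):
    """Basis of the orthogonal complement of span(x), via the SVD."""
    p, r = x.shape
    u, s, vt = np.linalg.svd(x, full_matrices=True)
    return u[:, r:]          # p x (p - r), columns orthogonal to col(x)

# Hypothetical p = 3 example with A0 = A(z_omega) of reduced rank r0 = 2
a0 = np.array([[1., 0.], [0., 1.], [0., 0.]])   # stand-in for alpha_0
b0 = np.array([[1., 0.], [0., 1.], [1., 1.]])   # stand-in for beta_0
A0 = a0 @ b0.T
A1 = np.eye(3)

p = 3
r0 = np.linalg.matrix_rank(A0)                  # reduced rank: r0 = 2 < p
a0p, b0p = orth_complement(a0), orth_complement(b0)
r1 = np.linalg.matrix_rank(a0p.T @ A1 @ b0p)

# POLE(1) (Johansen's I(1)) condition: r0 < p and r1 = p - r0
is_I1 = (r0 < p) and (r1 == p - r0)
```

For the I(2) case one would instead require r1 < p − r0 together with the full rank condition at j = 2.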
and define the p × p matrix functions Γ(z) and Λ(z) as follows. Then Γ(z) and Ξ(z) := A(z)Γ(z)^{−1}Λ(z)^{−1} are analytic and invertible on D(z_ω, η), η > 0, and Λ(z) is the local Smith form of A(z) at z_ω, A(z) = Ξ(z)Λ(z)Γ(z). Moreover, one can choose the factors E(z), M(z), H(z) for the local Smith factorization of C(z) defined in (16), see (12), accordingly.
Theorem 3 shows that the LRF fully characterizes the elements of the local Smith factorization of C(z) at z_ω. In fact, the values of j with r_j > 0 in the LRF provide the distinct partial multiplicities of C(z) at z_ω, and r_j gives the number of partial multiplicities that are equal to a given j; this characterizes the local Smith form Λ(z). Moreover, it also provides the construction of an extended canonical system of root functions.
Remark that γ_j(z)′, the j-th block of rows in Γ(z), satisfies γ_j(z_ω)′ = β_j′ and γ_j(z_ω)′ has full row rank; here γ_j(z)′ denotes the corresponding block of rows in Ξ(z)^{−1}. This shows that the γ_j(z)′ are r_j root functions of order d − j of C(z). The next result presents the Triangular representation, as proved in Franchi and Paruolo (2019, Corollary 4.6).

Proposition 5 (Triangular representation). Let x_t in (14) satisfy the POLE(d) condition on A(z), and define the blocks γ_j(z)′ as in (20). Then x_t is I(d) and it admits the Triangular Representation, where no linear combination of the l.h.s. exists that is integrated of lower order.
Observe that the canonical system of root functions φ(z) in (23) is not unique and not of minimal polynomial order, as discussed in the next section. The following example applies the above concepts in the I(2) VAR case.
Example 3 (I(2) VAR example continued). Consider Example 2. Applying truncation to the rows of (β_0 + ∆ᾱ_0 A_1), see Propositions 5 and A3, one finds that the columns in β_0 are root functions of C(z) at ω = 0 of order at least min(2, 1) = 1. Consider now one row in (β_0 + ∆ᾱ_0 A_1 + ∆²A) for some matrix A; this root function is of order 2 by Remark 23, and its truncation to degree 1, i.e., to the corresponding row of (β_0 + ∆ᾱ_0 A_1), is still of order 2 by Propositions 5 and A3. Finally, consider one row in (β_0 + ∆A), which gives a root function of order at least 1; its truncation to a polynomial of degree 0 gives the corresponding row of β_0, which has order at least 1 by Propositions 5 and A3. In fact, the rows of β_0 give root functions of order equal to 1 or to 2, the latter occurring when the corresponding entries in ᾱ_0 A_1 in (β_0 + ∆ᾱ_0 A_1) are equal to 0, as discussed below.

Minimal Bases
This section describes the algorithm of Forney (1975) to reduce the basis φ(z)′ to minimal order, using the generic notation b(z)′ in place of φ(z)′. The generic basis b(z)′ is assumed to be rational and of dimension r × p. The algorithm exploits the nesting row F (b(z)′) ⊂ row F[z] (b(z)′) ⊂ row F(z) (b(z)′). In the following, the j-th row of b(z)′ is indicated as b_j(z)′, which is the j-th element of the basis, j = 1, . . . , r. Successive modifications of the original basis b^(0)(z)′ := b(z)′ are indicated as b^(h)(z)′ for h = 1, 2, 3.
Definition 6 (Degree of b(z)′). If b(z)′ is a polynomial basis, the degree v_j of its j-th row, indicated as v_j := deg b_j(z)′, is defined as the maximum degree of its elements, and the degree v of b(z)′ is defined as v := deg b(z)′ := ∑_{j=1}^r v_j, i.e., the sum of the degrees of its rows.
The reduction algorithm proposed by Forney (1975, pp. 497-98) consists of the following 3 steps.
Step 1 If b^(0)(z)′ is not polynomial, multiply each row by the least common denominator of its elements to obtain a polynomial basis b^(1)(z)′.
Step 2 Reduce row orders in b (1) (z) by taking F[z]-linear combinations.
Step 3 Reduce b (2) (z) to a basis b (3) (z) with a full-row-rank high order coefficient matrix, i.e., a "row proper" basis.
This procedure delivers a final basis b^(3)(z)′ of lowest degree, see Forney (1975), Section 3.
Step 1
The initial basis b^(0)(z)′ is rational, and its j-th row b_j(z)′ has representation b_j(z)′ = (1/a_j(z)) c_j(z)′, where c_j(z)′ is a polynomial row vector, a_j(z) is a scalar polynomial, and c_j(z)′ and a_j(z) are relatively prime. The first step consists in computing b^(1)(z)′ = Q(z)b^(0)(z)′, where Q(z) := diag(a_1(z), . . . , a_r(z)) is a square polynomial matrix of dimension r.
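Step 1 can be sketched symbolically with sympy; the rational basis below is hypothetical:

```python
import sympy as sp
from functools import reduce

z = sp.symbols('z')

# Hypothetical rational basis b^(0)(z)' with r = 2 rows and p = 2 columns
b0 = sp.Matrix([[1/(1 - z/2), z],
                [z/(1 + z),   1]])

def row_lcd(row):
    """Least common denominator a_j(z) of the entries of one row."""
    dens = [sp.fraction(sp.cancel(e))[1] for e in row]
    return reduce(sp.lcm, dens, sp.Integer(1))

a = [row_lcd(b0.row(j)) for j in range(b0.rows)]
Q = sp.diag(*a)                          # Q(z) = diag(a_1(z), ..., a_r(z))
b1 = (Q * b0).applyfunc(sp.cancel)       # polynomial basis b^(1)(z)'

# Every entry of b^(1)(z)' is now a polynomial (denominator 1)
assert all(sp.fraction(sp.cancel(e))[1] == 1 for e in b1)
```

Since Q(z) is square with nonzero determinant, b^(1)(z)′ spans the same space as b^(0)(z)′ over F(z).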

Step 2
The second step reduces the degree of the rows in b^(1)(z)′. This involves finding specific points z_h, h = 1, . . . , k, at which rank(b^(1)(z_h)′) < r. To find them, one can calculate the greatest common divisor, g(z) say, of all r × r minors of b^(1)(z)′. If g(z) = 1 this step is complete, and the algorithm sets b^(2)(z)′ = b^(1)(z)′; otherwise one computes the zeros z_1, . . . , z_k of g(z), where z_h ∈ C, h = 1, . . . , k. The following substep is then applied to each root z_h sequentially, h = 1, . . . , k.
Denote by w(z)′ the current basis; this will be replaced by κ(z)′ at the end of this substep. For h = 1, one has w(z)′ = b^(1)(z)′. For z = z_h, all minors of order r of w(z_h)′ vanish, which means that w(z_h)′ is singular, i.e., it has reduced rank and rank factorization w(z_h)′ = ψa′, say, where ψ and a are of full column rank. Let c′ := (c_1, . . . , c_p) be one row in ψ⊥′. Indicate by A_c := {i : c_i ≠ 0} the set of its non-zero coefficients, and let v_{i_0} := max_{i∈A_c} {v_i} be the maximal degree of the rows in w(z)′ with nonzero coefficient in c′.
This substep consists of replacing row i_0 of w(z)′ with c′w(z)′/(z − z_h), which is still a polynomial vector. In fact c′w(z_h)′ = 0′, so that c′w(z)′ has representation c′w(z)′ = (z − z_h)τ(z)′ with τ(z)′ a polynomial vector, so that c′w(z)′/(z − z_h) = τ(z)′. This defines κ(z)′ in terms of w(z)′ as κ(z)′ = B(z)^{−1}Qw(z)′, where Q is an r × r square matrix, equal to I_r except for row i_0, which equals c′, and where B(z) is a diagonal matrix equal to I_r except for having z − z_h in its i_0-th position on the diagonal. Note that Q is nonsingular, because c_{i_0} ≠ 0. The same procedure is applied to each row c′ of ψ⊥′. This substep is repeated for all z_h, h = 1, . . . , k. The greatest common divisor of all r × r minors is then recalculated and the substep repeated for the new roots, until the greatest common divisor of all r × r minors of κ(z)′ is 1. When this is the case, Step 2 sets b^(2)(z)′ = κ(z)′.
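The substep above can be sketched on a small hypothetical polynomial basis (r = 2, p = 3) with a rank drop at z = 1, using sympy:

```python
import sympy as sp
from functools import reduce
from itertools import combinations

z = sp.symbols('z')

# Hypothetical Step-2 input: polynomial basis w(z)' with a rank drop at z = 1
w = sp.Matrix([[1, z, 0],
               [0, 1 - z, z*(1 - z)]])
r, p = w.shape

# Greatest common divisor of all r x r minors and its zeros z_h
minors = [w.extract(range(r), list(cols)).det()
          for cols in combinations(range(p), r)]
g = reduce(sp.gcd, minors)
zh = sp.solve(g, z)[0]                    # here z_h = 1

# c' spans the left null space of the singular w(z_h)'
c = w.subs(z, zh).T.nullspace()[0].T      # 1 x r row vector
# (z - z_h) divides c'w(z)'; the division reduces the row degree
row_new = ((c * w) / (z - zh)).applyfunc(sp.cancel)

kappa = w.copy()
kappa[1, :] = row_new                     # replace the maximal-degree row i_0

# The gcd of the r x r minors of the new basis is now 1: Step 2 is done
minors2 = [kappa.extract(range(r), list(cols)).det()
           for cols in combinations(range(p), r)]
assert reduce(sp.gcd, minors2) == 1
```

The total degree drops from 3 to 2, while the row span over F(z) is unchanged.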

Step 3
The last step operates on the high order coefficient matrix, repeating the following substep. Let w(z)′ indicate b^(2)(z)′ at the beginning of the substep; it will be replaced by κ(z)′ at the end of it. Let v_i be the degree of the i-th row of w(z)′, written as w_i(z)′ = ∑_{j=0}^{v_i} (z − z_ω)^j w_{ij}′. The high order coefficient matrix is defined as the r × p matrix w* := (w_{1v_1}, . . . , w_{rv_r})′, composed of the coefficient vectors of the highest power of (z − z_ω) in each row of w(z)′.
A necessary and sufficient condition for w* to be of full rank is that the degree of w(z)′ is equal to the maximum degree of its r × r minors. If this is not the case, w* is singular, i.e., it has rank factorization w* = ψa′ with ψ and a of full column rank. Hence one can choose a vector c′ := (c_1, . . . , c_p) as one row in ψ⊥′, for which one has c′w* = 0′.
As before, let A_c := {i : c_i ≠ 0} and define v_{i_0} := max_{i∈A_c}{v_i}. Let also n_i := v_{i_0} − v_i, note that n_i ≥ 0 for i ∈ A_c, and let Q(z) := diag((z − z_ω)^{n_1}, . . . , (z − z_ω)^{n_r}). Row i_0 in w(z)′ is replaced by
q(z)′ := c′Q(z)w(z)′ = ∑_{i∈A_c} c_i ∑_{j=0}^{v_i} (z − z_ω)^{j+n_i} w_{ij}′ = ∑_{s=0}^{v_{i_0}} (z − z_ω)^s q_s′, (24)
where s in the last expression is defined as j + n_i and q_s′ := ∑_{i∈A_c} c_i w_{i,s−n_i}′, with the convention w_{ij}′ := 0′ for j < 0 or j > v_i. The central expression in (24) shows that q(z)′ is polynomial, because n_i ≥ 0 in the exponents of (z − z_ω). In order to see that the degree of q(z)′ is also lower than v_{i_0}, one can note that the high order coefficient, which corresponds to s = v_{i_0} in (24), equals ∑_{i∈A_c} c_i w_{i,v_i}′ = c′w* = 0′. This implies that the degree of q(z)′ is lower than v_{i_0}, and that replacing row i_0 of w(z)′ with q(z)′ reduces the degree of the basis.
This defines κ(z)′ in terms of w(z)′ as κ(z)′ = NQ(z)w(z)′, where N is an r × r square matrix, equal to I_r except for row i_0, which equals c′. Note that N is nonsingular, because c_{i_0} ≠ 0. This process is repeated for all the rows c′ in ψ⊥′. Next set w(z)′ = κ(z)′ and repeat until the high order coefficient matrix has full rank. When this is the case, Step 3 sets b^(3)(z)′ = κ(z)′.
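A sketch of the Step-3 substep on a hypothetical polynomial basis, with the expansion point taken at z = 0 purely for notational simplicity:

```python
import sympy as sp

z = sp.symbols('z')

# Hypothetical Step-3 input whose high order coefficient matrix is singular
w = sp.Matrix([[z, 1, 0],
               [z, 0, 1]])
r, p = w.shape

def high_order_matrix(w):
    """w* = (w_{1,v_1}', ..., w_{r,v_r}'): leading coefficients row by row."""
    rows = []
    for i in range(w.rows):
        vi = max(sp.degree(e, z) for e in w.row(i) if e != 0)
        rows.append([e.coeff(z, vi) for e in w.row(i)])
    return sp.Matrix(rows)

wstar = high_order_matrix(w)
assert wstar.rank() < r                   # w* = psi a' has reduced rank

# c' is a row of psi_perp': a left null vector of w*
c = wstar.T.nullspace()[0].T              # here c' = (-1, 1)

# Row degrees are equal, so Q(z) = I_r and the new row is q(z)' = c'w(z)'
q_new = sp.expand(c * w)                  # degree drops from 1 to 0
kappa = w.copy()
kappa[1, :] = q_new

assert high_order_matrix(kappa).rank() == r   # row proper: Step 3 is done
```

After the replacement the total degree has dropped from 2 to 1 and the basis is row proper, hence minimal in this toy case.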

From a Canonical System of Root Functions to a Minimal Basis for I(2) VAR
This section applies the algorithm of Forney reviewed in Section 7 to φ(z)′ in (23), reducing the basis to minimal order in the I(2) VAR example at frequency ω = 0. This application leads to the separation of the cases of (i) non-polynomial cointegrating relations reducing the order of integration from 2 to 0, and (ii) polynomial cointegrating relations reducing the order of integration from 2 to 0.
The process of obtaining minimal bases does not lead to a unique choice of basis; this leaves open the choice of how to further restrict the basis to obtain uniqueness. Forney (1975) obtains uniqueness by requiring the minimal basis to be in upper echelon form. Other sets of restrictions can also be considered. For the sake of brevity, restrictions delivering a unique minimal basis are not further discussed here.

Step 1 in I(2) VAR
Consider the triangular representation of an I(2) system, see (23), and apply the algorithm of Forney (1975).

Next consider Step 2, and set w(z)′ = b^(1)(z)′. One wishes to find some zero z_h and some corresponding c′ so as to have c′w(z_h)′ = 0′. Denoting u = z_h − 1, one hence needs to find the pair (u, c′), with u a scalar, satisfying (27). Note that u = 0 is not a possible zero of (27), because w(z_ω)′ = (β_0, β_1)′ is of full row rank, so that u ≠ 0. Post-multiplying (27) by the square nonsingular matrix (β̄_0, β̄_1, β̄_2), and partitioning c′ as c′ = (ς′, θ′), where ς′ is 1 × r_0, one finds that the second set of equations gives θ = 0 and that the first one, substituting the expression γ_{0,1} = −ᾱ_0′A_1 given in Theorem 3, implies (29) and (30), where λ = u^{−1} ≠ 0 in (29); note also that u ≠ 0 has been simplified out in (30). This proves the following proposition.
Proposition 6 (Step 2 condition in I(2)). A necessary and sufficient condition for Step 2 to be non-empty is that (29) and (30) hold simultaneously, i.e., that (λ, ς′) is a non-zero eigenvalue-left eigenvector pair of ᾱ_0′A_1β̄_0, with the left eigenvector ς′ orthogonal to ᾱ_0′A_1(β̄_1, β̄_2). If this is the case, for each pair (λ, ς′) one has (31).
Observe that from (27), using c′ = (ς′, θ′), one obtains (32), where the last equality follows from (31). This shows that under the necessary and sufficient condition in Proposition 6, there is a linear combination c′ of w(z)′ from which one can factor z − z_h out of c′w(z)′, reducing the degree from 1 to 0. Here c′w(z)′, which has degree equal to 1, is replaced by c′w(z)′/(z − z_h) = −ς′ᾱ_0′A_1 = −λς′β_0′, which has degree 0. Note that from (31) the new cointegrating relation is in the span of β_0′. This can be done for all pairs (λ, ς′). Let (λ_j, ς_j′) be all the pairs (λ, ς′) satisfying the assumptions of Proposition 6, j = 1, . . . , s, and let q := (λ_1ς_1, . . . , λ_sς_s)′. Choose also a as some (r − s) × r matrix such that (q, a) is square and nonsingular; many matrices satisfy this criterion, including q⊥. The output of Step 2 can be expressed as in (33).
Remark 28 (CI(2,2) cointegration). This step brings out from φ(z)′ some cointegrating relations qβ_0′ that map the I(2) variables directly to I(0) without the help of first differences ∆.
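The eigenpair condition of Proposition 6 can be screened numerically; the matrices below are purely illustrative stand-ins for ᾱ_0′A_1β̄_0 and ᾱ_0′A_1(β̄_1, β̄_2), with r_0 = 2:

```python
import numpy as np

# Illustrative stand-ins: M00 for abar0' A1 bbar0 (r0 x r0),
# M012 for abar0' A1 (bbar1, bbar2)
M00 = np.array([[0.5, 0.0],
                [0.0, 0.0]])
M012 = np.array([[0.0, 0.0],
                 [1.0, 2.0]])

# Step 2 is non-empty iff some (lambda, sigma') satisfies
# sigma' M00 = lambda sigma', lambda != 0, and sigma' M012 = 0'
# (sigma stands for the paper's varsigma)
lams, V = np.linalg.eig(M00.T)            # left eigenvectors of M00
pairs = [(lam, v) for lam, v in zip(lams, V.T)
         if not np.isclose(lam, 0) and np.allclose(v @ M012, 0)]
```

Each retained pair yields one extra non-polynomial relation λς′β_0′; here exactly one pair, with λ = 0.5, survives the screening.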

Step 3 in I(2) VAR
Consider b^(2)(z)′ in (33) and its high order coefficient matrix w*. Step 3 requires one to find a nonzero matrix c such that c′w* = 0′. Recall that (β̄_0, β̄_1, β̄_2) is square and nonsingular; hence c′w* = 0′ if and only if, partitioning c′ as c′ = (ζ′, ρ′, τ′), Equation (34) holds. This equality can be written as (35).
Remark 29 (Further degree reductions). Equation (35) requires ζ to be orthogonal to the remaining part of the multicointegrating coefficient aᾱ_0′A_1 in the direction of β̄_2. In addition, ζ also needs to satisfy (34). For some configurations of dimensions, (34) could be solvable for (ρ′, τ′) in terms of the other quantities; in this case (34) would not impose further restrictions.
Let also ϑ be any complementary matrix such that (ζ, ϑ) is square and nonsingular; one possible choice of ϑ is ζ⊥. The output of Step 3 can be expressed as the choice in (36).
Remark 30 (Minimal basis). This step brings out from φ(z)′ some other cointegrating relations ζ′aβ_0′ that map the I(2) variables directly to I(0) without the help of first differences ∆. Equation (36) shows how the canonical system of root functions can be reduced to minimal order.
Example 4 (Multicointegration coefficient in the span of β_2). Consider the special case in which the multicointegrating coefficient ᾱ_0′A_1 satisfies ᾱ_0′A_1 = ᾱ_0′A_1 P_{β_2}, i.e., it has components only in the direction of β_2. This special case is relevant, because β_2′∆x_t ∼ I(1) while β_i′∆x_t ∼ I(d) with d ≤ 0 for i = 0, 1.
One can see that in this case the conditions in Proposition 6 are not satisfied. In fact (29) cannot hold, as ᾱ_0′A_1β̄_0 = 0.
Step 2 is hence empty; this implies that the rows including q are missing and that a = I in b^(2)(z)′ in (33) and (36). Applying Step 3, Equation (34) is always satisfied by the choice ρ = 0, τ = 0, because ᾱ_0′A_1(β̄_0, β̄_1) = 0. Equation (35) then reads ζ′ᾱ_0′A_1β̄_2 = 0′, which is satisfied if and only if δ := ᾱ_0′A_1β̄_2 has reduced rank. In this case, let the rank factorization be δ = ψη′, with ψ and η of full column rank. One can then let ζ = ψ⊥ and choose ϑ = ψ̄, which delivers the corresponding minimal basis. There are several examples of this separation in the I(2) VAR literature; for example, Kongsted (2005) discusses this case when r_0 > r_2.
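The rank factorization δ = ψη′ and the choice ζ = ψ⊥ used above can be computed via the SVD; the δ below is a hypothetical reduced-rank matrix, not one derived from a specific VAR:

```python
import numpy as np

# Hypothetical delta = abar0' A1 bbar2 with r0 = 3, r2 = 2 and rank 1
delta = np.outer([1., 2., 0.], [1., 1.])

u, s, vt = np.linalg.svd(delta)
k = int(np.sum(s > 1e-10))                # numerical rank of delta

# Rank factorization delta = psi @ eta.T with full-column-rank factors
psi = u[:, :k] * s[:k]
eta = vt[:k, :].T
assert np.allclose(psi @ eta.T, delta)

# zeta = psi_perp: the remaining left singular vectors, zeta' delta = 0
zeta = u[:, k:]
assert np.allclose(zeta.T @ delta, 0)
```

The columns of `zeta` play the role of ζ, i.e., they select the directions giving non-polynomial relations in this special case.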

Conclusions
This paper discusses the notion of cointegrating space for general I(d) processes. The notion of cointegrating space was formally introduced in the literature by Johansen (1988) for the case of an I(1) VAR system. The definition of the cointegrating space is simplest in the I(1) case without multicointegration, because there is no need to consider vector polynomials in the lag operator. Engle and Yoo (1991) introduced the notion of polynomial cointegrating vectors in parallel with the related notion of multicointegration in Granger and Lee (1989). However, the literature has not yet discussed the notion of cointegrating space in the general polynomial case; this paper fills this gap.
In this context, this paper recognises that cointegrating vectors are in general root functions, which have been analysed at length in the mathematical and engineering literature, see e.g., Gohberg et al. (1993). This allows one to characterise a number of properties of cointegrating vectors.
Canonical systems of root functions are found to provide a basis of several notions of cointegrating space in the multicointegrated case. The extended local rank factorization of Franchi and Paruolo (2016) can be used to explicitly derive a canonical system of root functions. This result is constructive, as it gives an explicit way to derive such a basis from the VAR polynomial.
The canonical system of root functions constructed in this way is not necessarily of minimal polynomial degree, however. The three-step procedure of Forney (1975) to reduce this basis to minimal degree is reviewed and restated in terms of rank factorizations. The application of this procedure to I(2) VAR systems is shown to separate the polynomial and the non-polynomial cointegrating vectors.
(iii) if a(z) ∈ F(z), a(z) ≠ 0, then a(z)ζ(z)′ is a generalized root function on D(z_ω, η) of order s and exponent n ∈ Z.
(ii). Set a_2(z) = 1 in the proof of (iii), and note that the exponent is n_1, which is either 0 or positive.
(i). Set a_1(z) = a, a_2(z) = 1 in the proof of (iii), and note that the exponent is n_1 = 0.

Remark A3 (A generalized root function is meromorphic). A generalized root function ξ(z)′ is analytic on D(z_ω, η) except for a possible pole at the isolated point z_ω, i.e., it is a meromorphic function on D(z_ω, η).
Remark A4 (A generalized root function can be analytic). When the exponent n of ξ(z) is zero, the generalized root function ξ(z) coincides with the root function ζ(z) . When the exponent n of ξ(z) is positive, then the generalized root function ξ(z) has a zero at z ω . In both cases ξ(z) is analytic. So a generalized root function can be analytic (with or without a zero at z ω ).
Remark A5 (Generalized root function and cointegration). Observe that Definition A1 implies the following: given a meromorphic function ξ(z)′, check whether it has a root or a pole at z_ω; this function is a generalized root function if, after removing the pole or the zero at z_ω by multiplying it by (1 − z/z_ω)^{−n}, where n is the order of the root or of the pole, the resulting function is a root function, i.e., a cointegrating vector. This is in line with Definition 5.
Attention is now turned to linear combinations of a canonical system of root functions φ(z) . The scalars of the linear combination can be in F, F[z] or F(z). The main result in Proposition A2 below is that F[z]-linear combinations of φ(z) generate a generalized root function possibly with a zero at z ω , while F(z)-linear combinations of φ(z) generate a generalized root function possibly with a pole or a zero at z ω .
In the following, let v = (v_1, . . . , v_m)′ ∈ F^m be a 1 × m vector with elements in F. Let also A_v be the set of non-zero entries of v, A_v := {i : v_i ≠ 0}, with n_v the cardinality of A_v and (i_1, . . . , i_{n_v}) the ordered set of indices in A_v, i_1 < · · · < i_{n_v}, i_j ∈ A_v. Similarly, let w(z) = (w_1(z), . . . , w_m(z))′ ∈ F[z]^m be a 1 × m vector with polynomial elements w_i(z) ∈ F[z], with (j_1, . . . , j_{n_w}) the ordered set of indices of nonzero elements in A_w := {i : w_i(z) ≠ 0}, and let finally u(z) = (u_1(z), . . . , u_m(z))′ ∈ F(z)^m be a 1 × m vector with rational elements u_i(z) ∈ F(z), with (k_1, . . . , k_{n_u}) the ordered set of indices of nonzero elements in A_u := {i : u_i(z) ≠ 0}.
Proposition A2 (Linear combinations). Let φ(z)′ = (φ_1(z), . . . , φ_m(z))′ be a canonical system of root functions of C(z) on a disc D(z_ω, η), η > 0, with orders s_1, . . . , s_m; let also v ∈ F^m, w(z) ∈ F[z]^m and u(z) ∈ F(z)^m be nonzero vectors; one has: (i) v′φ(z)′ = ∑_{i=1}^m v_iφ_i(z)′ is a root function of order s = min_{i∈A_v} s_i; (ii) w(z)′φ(z)′ = ∑_{j=1}^m w_j(z)φ_j(z)′ is a generalized root function, with exponent q = min_{j∈A_w}(q_j) ≥ 0, where q_j is the order of z_ω as a zero of w_j(z), and with order s := min_{j∈A_w}(q_j − q + s_j) > 0; (iii) u(z)′φ(z)′ = ∑_{k=1}^m u_k(z)φ_k(z)′ is a generalized root function, possibly with a pole or a zero at z_ω, with exponent q = min_{k∈A_u}(q_k) ∈ Z, where q_k is the order of z_ω as a pole or as a zero of u_k(z), and with order s = min_{k∈A_u}(q_k − q + s_k) > 0.
Proof. (i). F is a field, and hence it is closed under multiplication. Hence v′φ(z)′ is a polynomial with coefficient vectors in F^p, of the same form as each φ_i(z)′, and one finds that v′φ(z)′C(z) = (1 − z/z_ω)^s ζ(z)′, where s := min{s_{i_1}, . . . , s_{i_{n_v}}}. Note that because v is a nonzero vector, the set A_v is not empty. Next observe that ζ(z_ω)′ ≠ 0′, as otherwise this would contradict the property of φ_{i_h}(z)′ of being of maximal order and linearly independent from the previous root functions φ_i(z)′ for i < i_h. This shows that v′φ(z)′ is a root function of order s.
(ii). By Proposition A1.(i), w_i(z)φ_i(z)′ is a generalized root function with representation w_i(z)φ_i(z)′ = (1 − z/z_ω)^{q_i} w̃_i(z)φ_i(z)′, say, with q_i ≥ 0 and w̃_i(z)φ_i(z)′ a root function of order s_i. Let q := min(q_{j_1}, . . . , q_{j_{n_w}}), and note that w(z)′φ(z)′ = (1 − z/z_ω)^q ζ(z)′ with ζ(z)′ := ∑_{h=1}^{n_w} (1 − z/z_ω)^{q_{j_h}−q} w̃_{j_h}(z)φ_{j_h}(z)′. In order to show that ζ(z_ω)′ ≠ 0′, let B_w be the set of indices j ∈ A_w with q_j = q, and observe that ζ(z_ω)′ = ∑_{j∈B_w} w̃_j(z_ω)φ_j(z_ω)′, where w̃_j(z_ω) ≠ 0 by construction and φ_j(z_ω)′ ≠ 0′ by the definition of root function. If ζ(z_ω)′ = 0′, this would imply that there is a nonzero linear combination of φ(z_ω)′ equal to 0′, i.e., that φ(z_ω)′ is not of full row rank, which contradicts the construction in Definition 4. This implies that ζ(z_ω)′ ≠ 0′, and that w(z)′φ(z)′ is a generalized root function with exponent q.
Next, because φ_j(z)′ is a root function of order s_j, one has ζ(z)′C(z) = ∑_{h=1}^{n_w} (1 − z/z_ω)^{q_{j_h}−q} w̃_{j_h}(z)φ_{j_h}(z)′C(z) = ∑_{h=1}^{n_w} (1 − z/z_ω)^{q_{j_h}−q+s_{j_h}} w̃_{j_h}(z)φ̃_{j_h}(z)′ = (1 − z/z_ω)^s ζ̃(z)′, where ζ̃(z)′ := ∑_{h=1}^{n_w} (1 − z/z_ω)^{q_{j_h}−q+s_{j_h}−s} w̃_{j_h}(z)φ̃_{j_h}(z)′. Finally, in order to prove that the order of the generalized root function is s, one needs to show that ζ̃(z_ω)′ ≠ 0′. Let C_w be the set of indices j ∈ A_w with q_j − q + s_j = s, and observe that ζ̃(z_ω)′ = ∑_{j∈C_w} w̃_j(z_ω)φ̃_j(z_ω)′, where w̃_j(z_ω) ≠ 0 and φ̃_j(z_ω)′ ≠ 0′ as above. If ζ̃(z_ω)′ = 0′, then there would exist a nonzero linear combination of the φ̃_j(z_ω)′ equal to 0′, which would imply the existence of a root function of higher order obtained by combination of the rows in φ(z)′ with indices in C_w, which contradicts the fact that the orders are chosen to be maximal. This implies that the order of the generalized root function is equal to s. (iii). The proof is the same as in (ii). Note that here the q_i may be negative.
Remark A6 (Closure with respect to linear combinations). Proposition A2 shows that F[z]-linear combinations and F(z)-linear combinations of a canonical system of root functions φ(z)′ produce generalized root functions. Note that φ(z)′ is itself a set of generalized root functions (with 0 exponent). Hence, in this sense, generalized root functions are closed under F[z]-linear combinations and F(z)-linear combinations.
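Proposition A2.(i) can be illustrated on a toy diagonal C(z); the example below is hypothetical, with two root functions of orders 2 and 1 at z_ω = 1:

```python
import sympy as sp

z = sp.symbols('z')
z_omega = 1

# Toy C(z): diagonal, with zeros of different multiplicities at z_omega = 1
C = sp.diag((1 - z)**2, 1 - z, sp.Integer(1))

def root_order(zeta, C, z0=z_omega):
    """Multiplicity of z0 as a zero of zeta(z)' C(z): minimum over entries."""
    prod = sp.expand(zeta * C)
    mults = [sp.roots(sp.Poly(e, z)).get(z0, 0) for e in prod if e != 0]
    return min(mults)

phi1 = sp.Matrix([[1, 0, 0]])   # root function of order 2
phi2 = sp.Matrix([[0, 1, 0]])   # root function of order 1

# An F-linear combination has order min(s_1, s_2), as in Proposition A2.(i)
s = root_order(phi1 + phi2, C)
```

Multiplying phi2 by the scalar polynomial w(z) = (1 − z) before combining would instead produce a generalized root function with a positive exponent, as in Proposition A2.(ii).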
Remark A9 (Truncated cointegrating vectors). Proposition A3. (ii) implies that truncating a cointegrating vector to order < s preserves the cointegrating property, but not necessarily the order s.
Remark A10 (Cointegrating vectors in I_ω(1) VAR can be chosen not to be polynomial). Consider Example 1, where the orders of integration of (polynomial) linear combinations can be either 1 or 0. In this case, root functions are of order at most s = 1, and Proposition A3.(iii) ensures that the root functions can be truncated to order s − 1 = 0, i.e., to non-polynomial linear combinations.
Remark A11 (A generic I_ω(1) process may have polynomial cointegration relations). Consider now the generic case of an I(1) process. The orders of integration of (polynomial) linear combinations can be 0, −1, −2, . . . , −d say, with d > 0. In this case, root functions can be of order s = 1, 2, . . . , d + 1, and Proposition A3.(iii) ensures that the root functions can be truncated to order d. If d > 0, this may require polynomial linear combinations also in the I_ω(1) case.
Remark A12 (Polynomial cointegration vectors in I_ω(2) VAR can be chosen of order at most 1). Consider Example 2, where the orders of integration of (polynomial) linear combinations can be either 2, 1 or 0. In this case, root functions are of order at most s = 2, and Proposition A3.(iii) ensures that the root functions can be truncated to order s − 1 = 1, i.e., to polynomial linear combinations of order 1.
Remark A13 (Multicointegrated systems require polynomial cointegration relations). As shown in the previous three remarks, multicointegrated systems in general require the consideration of polynomial linear combinations.
Notes
1. See Engle and Granger (1987, pp. 253-54). Here N in their notation is replaced by p and α with β for consistency with the rest of the paper.
2. The following notation is employed: F = R, C indicates either the field of real numbers R or the field of complex numbers C. If a matrix A = (a_1, . . . , a_n) is written in terms of its columns, col_F(A) indicates the column span of A with coefficients in F, i.e., col_F(A) := {v : v = ∑_{i=1}^n c_i a_i, c_i ∈ F}, and row_F(A′) denotes the row span of A′ with coefficients in F, i.e., row_F(A′) := {v′ : v′ = ∑_{i=1}^n c_i a_i′, c_i ∈ F}, where A′ indicates the conjugate transpose of A. Hence v ∈ col_F(A) if and only if v′ ∈ row_F(A′), i.e., the spaces coincide, but the former contains column vectors while the latter contains row vectors. Here the row form is employed.
3. ε_t could be taken to be non-autocorrelated instead of i.i.d. with no major changes in the results in the paper.
4. This result is usually stated as x_t = u_t − a_0, where a_0 := x_0 − u_0 is a generic constant, see e.g., Hannan and Deistler (1988), Equation (1.2.15).
5. In fact, substituting C(z) = C + C̃(z)(1 − z), one finds ζ′∆x_t = ζ′Cε_t + ζ′C̃(L)∆ε_t, and applying S to both sides gives ζ′x_t − ζ′x_0 = ζ′CSε_t + u_t − u_0, where u_t := ζ′C̃(L)ε_t is stationary. The term Sε_t is a bilateral random walk (Franchi and Paruolo 2019), a nonstationary process, so that the l.h.s. can be made stationary if and only if the coefficient ζ′C loading Sε_t is 0.
6. Theorem 3 provides two constructions of the local Smith factorization.
7. In this case A(z) is analytic for all z ∈ C.