Abstract
We develop and discuss a parameterization of vector autoregressive moving average processes with arbitrary unit roots and (co)integration orders. The detailed analysis of the topological properties of the parameterization—based on the state space canonical form of Bauer and Wagner (2012)—is an essential input for establishing statistical and numerical properties of pseudo maximum likelihood estimators as well as, e.g., pseudo likelihood ratio tests based on them. The general results are exemplified in detail for the empirically most relevant cases, the (multiple frequency or seasonal) I(1) and the I(2) case. For these two cases we also discuss the modeling of deterministic components in detail.
1. Introduction
Since the seminal contribution of Clive W.J. Granger (1981) that introduced the concept of cointegration, the modeling of multivariate (economic) time series with models and methods that allow for unit roots and cointegration has become standard econometric practice with applications ranging from macroeconomics to finance to climate science.
The most prominent (parametric) model class for cointegration analysis is that of vector autoregressive (VAR) models, popularized by the important contributions of Søren Johansen and Katarina Juselius and their co-authors, see, e.g., the monographs Johansen (1995) and Juselius (2006). The popularity of VAR cointegration analysis stems not only from the (relative) simplicity of the model class, but also from the fact that the VAR cointegration literature is very well-developed and provides a large battery of tools for diagnostic testing, impulse response analysis, forecast error variance decompositions and the like. All this makes VAR cointegration analysis to a certain extent the benchmark in the literature.1
The imposition of specific cointegration properties on an estimated VAR model becomes increasingly complicated as one moves away from the I(1) case. As discussed in Section 2, e.g., in the I(2) case a triple of indices needs to be chosen (fixed or determined via testing) to describe the cointegration properties. The imposition of cointegration properties in the estimation algorithm then leads to “switching” type algorithms that come together with non-trivial parameterization restrictions involving non-linear inter-relations, compare Paruolo (1996) or Paruolo (2000).2 Mathematically, these complications arise from the fact that the unit root and cointegration properties are in the VAR setting related to rank restrictions on the autoregressive polynomial matrix and its derivatives.
Restricting cointegration analysis to VAR processes may be too restrictive. First, it is well-known since Zellner and Palm (1974) that VAR processes are not invariant with respect to marginalization, i.e., subsets of the variables of a VAR process are in general vector autoregressive moving average (VARMA) processes. Second, similar to the first argument, aggregation of VAR processes also leads to VARMA processes, an issue relevant, e.g., in the context of temporal aggregation and in mixed-frequency settings. Third, the linearized solutions to dynamic stochastic general equilibrium (DSGE) models are typically VARMA rather than VAR processes, see, e.g., Campbell (1994). Fourth, a VARMA model may be a more parsimonious description of the data generating process (DGP) than a VAR model, with parsimony becoming more important with increasing dimension of the process.3
If one accepts the above arguments as a motivation for considering VARMA processes in cointegration analysis, it is convenient to move to the—essentially equivalent (see Hannan and Deistler 1988, chps. 1 and 2)—state space framework. A key challenge when moving from VAR to VARMA models—or state space models—is that identification becomes an important issue for the latter model class, whereas unrestricted VAR models are (reduced-form) identified. In other words, there are so-called equivalence classes of VARMA models that lead to the same dynamic behavior of the observed process. As is well-known, to achieve identification, restrictions have to be placed on the coefficient matrices in the VARMA case, e.g., zero or exclusion restrictions. A mapping attaching to every transfer function, i.e., the function relating the error sequence to the observed process, a unique VARMA (or state space) system from the corresponding class of observationally equivalent systems is called canonical form. Since not all entries of the coefficient matrices in canonical form are free parameters, for statistical analysis a so-called parameterization is required that maps the free parameters from coefficient matrices in canonical form into a parameter vector. These issues, including the importance of properties such as continuity and differentiability of parameterizations, are discussed in detail in Hannan and Deistler (1988, chp. 2) and, of course, are also relevant for our setting in this paper.
The convenience of the state space framework for unit root and cointegration analysis stems from the fact that (static and dynamic) cointegration can be characterized by orthogonality constraints, see Bauer and Wagner (2012), once an appropriate basis for the state vector, which is a (potentially singular) VAR process of order one, is chosen. The integration properties are governed by the eigenvalue structure of unit modulus eigenvalues of the system matrix in the state equation. Eigenvalues of unit modulus and orthogonality constraints arguably are easier restrictions to deal with or to implement than the interrelated rank restrictions considered in the VAR or VARMA setting. The canonical form of Bauer and Wagner (2012) is designed for cointegration analysis by using a basis of the state vector that puts the unit root and cointegration properties to the center and forefront. Consequently, these results are key input for the present paper and are thus briefly reviewed in Section 3.
An important problem with respect to appropriately defining the “free parameters” in VARMA models is the fact that no continuous parameterization of all VARMA or state space models of a certain order n exists in the multivariate case (see Hazewinkel and Kalman 1976). This implies that the model set, $M$ say, has to be partitioned into subsets $M_\gamma$ on which continuous parameterizations exist, i.e., $M = \bigcup_{\gamma \in G} M_\gamma$ for some multi-index $\gamma$ varying in an index set G. Based on the canonical form of Bauer and Wagner (2012), the partitioning is according to systems—in addition to other restrictions such as fixed order n—with fixed unit root properties, to be precise over systems with given state space unit root structure. This has the advantage that, e.g., pseudo maximum likelihood (PML) estimation can straightforwardly be performed over systems with fixed unit root properties without any further ado, i.e., without having to consider (or ignore) rank restrictions on polynomial matrices. The definition and detailed discussion of the properties of this parameterization is the first main result of the paper.
The second main set of results, provided in Section 4, is a detailed discussion of the relationships between the different subsets of models for different indices and the parameterization of the respective model sets. Knowledge concerning these relations is important to understand the asymptotic behavior of PML estimators and pseudo likelihood ratio tests based on them. In particular, the structure of the closure $\overline{M}$, say, of the considered model set M has to be understood, since the difference $\overline{M} \setminus M$ cannot be avoided when maximizing the pseudo likelihood function4. Additionally, the inclusion properties between the different sets need to be understood, as this knowledge is important for developing hypothesis tests, in particular for developing hypothesis tests for the dimensions of cointegrating spaces. Hypothesis testing, with a focus on the MFI(1) and I(2) cases, is discussed in Section 5, which shows how the parameterization results of the paper can be used to formulate a large number of hypotheses on (static and polynomial) cointegrating relationships as considered in the VAR cointegration literature. This discussion also includes commonly used deterministic components such as intercept, seasonal dummies, and linear trend, as well as restrictions on these components.
The paper is organized as follows: Section 2 briefly reviews VAR and VARMA models with unit roots and cointegration and discusses some of the complications arising in the VARMA case in addition to the complications arising due to the presence of unit roots and cointegration already in the VAR case. Section 3 presents the canonical form and the parameterization based on it, with the discussion starting with the multiple frequency I(1)—MFI(1)—and I(2) cases prior to a discussion of the general case. This section also provides several important definitions like, e.g., of the state space unit root structure. Section 4 contains a detailed discussion concerning the topological structure of the model sets and Section 5 discusses testing of a large number of hypotheses on the cointegrating spaces commonly tested in the cointegration literature. The discussion in Section 5 focuses on the empirically most relevant MFI(1) and I(2) cases and includes the usual deterministic components considered in the literature. Section 6 briefly summarizes and concludes the paper. All proofs are relegated to the Appendix A and Appendix B.
Throughout we use the following notation: L denotes the lag operator, i.e., $L(y_t)_{t \in \mathbb{Z}} = (y_{t-1})_{t \in \mathbb{Z}}$, for brevity written as $Ly_t = y_{t-1}$. For a matrix $A \in \mathbb{C}^{m \times n}$, $A^*$ denotes its conjugate transpose. For $A \in \mathbb{C}^{m \times n}$ with full column rank $n < m$, we define $A_\perp \in \mathbb{C}^{m \times (m-n)}$ of full column rank such that $A^* A_\perp = 0$. $I_p$ denotes the p-dimensional identity matrix, $0_{m \times n}$ the m times n zero matrix. For two matrices A and B, $A \otimes B$ denotes the Kronecker product of A and B. For a complex valued quantity x, $\Re(x)$ denotes its real part, $\Im(x)$ its imaginary part and $\bar{x}$ its complex conjugate. For a set V, $\overline{V}$ denotes its closure.5 For two sets V and W, $V \setminus W$ denotes the difference of V and W, i.e., $V \setminus W = \{x \in V : x \notin W\}$. For a square matrix A we denote the spectral radius (i.e., the maximum of the moduli of its eigenvalues) by $\lambda_{max}(A)$ and by $\det(A)$ its determinant.
2. Vector Autoregressive, Vector Autoregressive Moving Average Processes and Parameterizations
In this paper, we define VAR processes $(y_t)_{t \in \mathbb{Z}}$, $y_t \in \mathbb{R}^s$, as solutions of
$$y_t = a_1 y_{t-1} + \dots + a_p y_{t-p} + \Phi d_t + \varepsilon_t, \qquad (1)$$
with $a(L) := I_s - a_1 L - \dots - a_p L^p$, where $a_j \in \mathbb{R}^{s \times s}$ for $j = 1, \dots, p$, $a_p \neq 0$, a white noise process $(\varepsilon_t)_{t \in \mathbb{Z}}$, $\varepsilon_t \in \mathbb{R}^s$, with $\mathbb{E}\varepsilon_t = 0$ and $\mathbb{E}\varepsilon_t\varepsilon_t' = \Sigma > 0$, and a vector sequence $(d_t)_{t \in \mathbb{Z}}$, $d_t \in \mathbb{R}^m$, comprising deterministic components like, e.g., the intercept, seasonal dummies or a linear trend. Furthermore, we impose the non-explosiveness condition $\det a(z) \neq 0$ for all $|z| < 1$, with $a(z) := I_s - a_1 z - \dots - a_p z^p$ and z denoting a complex variable.6
Thus, for given autoregressive order p, with—as defining characteristic of the order—$a_p \neq 0$, the considered class of VAR models with specified deterministic components is given by the set of all polynomial matrices $a(z)$ such that (i) the non-explosiveness condition holds, (ii) $a(0) = I_s$ and (iii) $a_p \neq 0$; together with the set of all matrices $\Phi \in \mathbb{R}^{s \times m}$.
Equivalently, the model class can be characterized by a set of rational matrix functions $k(z) := a(z)^{-1}$, referred to as transfer functions, and the input-output description for the deterministic variables, i.e.,
$$y_t = a(L)^{-1} \Phi d_t + a(L)^{-1} \varepsilon_t.$$
The associated parameter space is $\Theta \subset \mathbb{R}^{s^2 p + sm}$, where the parameters
$$\theta := \mathrm{vec}([a_1, \dots, a_p]), \qquad \theta_\Phi := \mathrm{vec}(\Phi) \qquad (2)$$
are obtained from stacking the entries of the matrices $a_1, \dots, a_p$ and Φ, respectively.
Remark 1.
In the above discussion the parameters, $\theta_\Sigma$ say, describing the variance covariance matrix Σ of $\varepsilon_t$ are not considered. These can easily be included, similarly to Φ, by, e.g., parameterizing positive definite symmetric matrices via their lower triangular Cholesky factor. This leads to a parameter space $\Theta \times \Theta_\Sigma$. We omit $\theta_\Sigma$ for brevity, since typically no cross-parameter restrictions involving parameters corresponding to Σ are considered, whereas, as discussed in Section 5, parameter restrictions involving—in this paper in the state space rather than the VAR setting—both elements of θ and Φ, to, e.g., impose the absence of a linear trend in the cointegrating space, are commonly considered in the cointegration literature.7 The estimator of the variance covariance matrix Σ often equals the sample variance of suitable residuals from (1) if there are no cross-restrictions between θ and $\theta_\Sigma$. This holds, e.g., for the Gaussian pseudo maximum likelihood estimator. Thus, explicitly including $\theta_\Sigma$ and $\Theta_\Sigma$ in the discussion would only overload notation without adding any additional insights, given the simple nature of the parameterization of Σ.
Remark 2.
Our consideration of deterministic components is a special case of including exogenous variables. We include exogenous deterministic variables with a static input-output behavior governed solely by the matrix Φ. More general exogenous variables that are dynamically related to the output could be considered, thereby considering so-called VARX models rather than VAR models, which would necessitate considering, in addition to the transfer function $k(z)$, also a transfer function $l(z)$, say, linking the exogenous variables dynamically to the output.
For the VAR case, the fact that the mapping assigning to a given transfer function $k(z) = a(z)^{-1}$, together with Φ, a parameter vector $(\theta', \theta_\Phi')'$—the parameterization—is continuous with continuously differentiable inverse is immediate.8 Homeomorphicity of a parameterization is important for the properties of parameter estimators, e.g., the ordinary least squares (OLS) or Gaussian PML estimator, compare the discussion in Hannan and Deistler (1988, Theorem 2.5.3 and Remark 1, p. 65).
For OLS estimation one typically considers the larger parameter set obtained without imposing the non-explosiveness condition and without the assumption $a_p \neq 0$:
Considering this larger set allows for unconstrained optimization. It is well-known that for a DGP as given above, the OLS estimator is consistent over this larger set, i.e., without imposing non-explosiveness and also when specifying p too high. Alternatively, and closely related to OLS in the VAR case, the pseudo likelihood can be maximized over the restricted set. With this approach, maxima respectively suprema can occur at the boundary of the parameter space, i.e., maximization effectively has to be performed over its closure. It is well-known that the PML estimator is consistent in the stable case (cf. Hannan and Deistler 1988, Theorem 4.2.1), but the maximization problem is complicated by the restrictions on the parameter space stemming from the non-explosiveness condition. Avoiding these complications, together with the asymptotic equivalence of OLS and PML in the stable VAR case, explains why VAR models are usually estimated by OLS.9
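To make the unconstrained nature of this estimation step concrete, the following minimal sketch estimates a VAR(p) with intercept by least squares in Python/NumPy; the function name and interface are our own choices and not taken from the paper, and no non-explosiveness or $a_p \neq 0$ restrictions are imposed:

```python
import numpy as np

def ols_var(y, p):
    """Unrestricted OLS for a VAR(p) with intercept.

    y: (T, s) data array. Returns the list [a_1, ..., a_p], the
    intercept and the residuals. No stability constraints are imposed,
    mirroring the unconstrained optimization discussed above.
    """
    T, s = y.shape
    # Regressors: y_{t-1}, ..., y_{t-p} and a constant, for t = p, ..., T-1.
    Z = np.hstack([y[p - j:T - j] for j in range(1, p + 1)]
                  + [np.ones((T - p, 1))])
    coef, *_ = np.linalg.lstsq(Z, y[p:], rcond=None)
    A = [coef[(j - 1) * s:j * s].T for j in range(1, p + 1)]
    resid = y[p:] - Z @ coef
    return A, coef[-1], resid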
To be more explicit, ignore deterministic components for a moment and consider the case where the DGP is a stationary VAR process, i.e., a solution of (1) with $a(z)$ satisfying the stability condition $\det a(z) \neq 0$ for $|z| \leq 1$. Define the corresponding set of stable transfer functions:
Clearly, the set of stable transfer functions is an open subset of the set of all considered transfer functions. If the DGP is a stationary VAR process, the above-mentioned consistency result of the OLS estimator implies that the probability that the estimated transfer function, $\hat{k}(z)$ say, is contained in this set converges to one as the sample size tends to infinity. Moreover, the asymptotic distribution of the estimated parameters is normal, under appropriate assumptions on $(\varepsilon_t)_{t \in \mathbb{Z}}$.
The situation is a bit more involved if the transfer function of the DGP corresponds to a point in the complement of the stable set within its closure, which contains systems with unit roots, i.e., determinantal roots of $a(z)$ on the unit circle, as well as lower order autoregressive systems—with these two cases non-disjoint. The stable lower order case is relatively unproblematic from a statistical perspective. If, e.g., OLS estimation is performed for order p, while the true model corresponds to an element of the set with lower order $\tilde{p} < p$, the OLS estimator is still consistent, since this set is contained in the set corresponding to order p. Furthermore, standard chi-squared pseudo likelihood ratio test based inference still applies. The integrated case, for a precise definition see the discussion below Definition 1, is a bit more difficult to deal with, as in this case not all parameters are asymptotically normally distributed and nuisance parameters may be present. Consequently, parameterizations that do not take the specific nature of unit root processes into account are not very useful for inference in the unit root case, see, e.g., Sims et al. (1990, Theorem 1). Studying the unit root and cointegration properties is facilitated by resorting to suitable parameterizations that “zoom in on the relevant characteristics”.
In case the only determinantal root of $a(z)$ on the unit circle is located at $z = 1$, the system corresponds to a so-called I(d) process, with the integration order d made precise in Definition 1 below. Consider first the I(1) case: As is well-known, the rank of the matrix $\Pi := -a(1)$ equals the dimension of the cointegrating space given in Definition 3 below—also referred to as the cointegrating rank. Therefore, determination of the rank of this matrix is of key importance. With the parameterization used so far, imposing a certain (maximal) rank on Π implies complicated restrictions on the matrices $a_1, \dots, a_p$. This in turn renders the correspondingly restricted optimization unnecessarily complicated and not conducive to developing tests for the cointegrating rank. It is more convenient to consider the so-called vector error correction model (VECM) representation of autoregressive processes, discussed in full detail in the monograph Johansen (1995). To this end let us first introduce the differencing operator at frequency $\omega \in [0, \pi]$:
$$\Delta_\omega(L) := \begin{cases} 1 - e^{i\omega} L, & \omega \in \{0, \pi\}, \\ (1 - e^{i\omega} L)(1 - e^{-i\omega} L), & \omega \in (0, \pi). \end{cases} \qquad (3)$$
For notational brevity, we omit the dependence on L in $\Delta_\omega(L)$, henceforth denoted as $\Delta_\omega$. Using this notation, the I(1) error correction representation is given by
$$\Delta_0 y_t = \alpha\beta' y_{t-1} + \sum_{j=1}^{p-1} \Gamma_j \Delta_0 y_{t-j} + \Phi d_t + \varepsilon_t, \qquad (4)$$
with the matrix $\Pi := \alpha\beta' \in \mathbb{R}^{s \times s}$ of rank r factorized into the product of two full rank matrices $\alpha \in \mathbb{R}^{s \times r}$ and $\beta \in \mathbb{R}^{s \times r}$, $0 \leq r \leq s$.
This constitutes a reparameterization, where the transfer function is now represented by the matrices $\alpha, \beta, \Gamma_1, \dots, \Gamma_{p-1}$ and a corresponding parameter vector. Please note that simply stacking the entries of these matrices does not lead to a homeomorphic mapping to the set of transfer functions, since for $0 < r < s$ the matrices α and β are not identifiable from the product $\Pi = \alpha\beta'$, since $\alpha\beta' = (\alpha T)(\beta (T')^{-1})'$ for all regular matrices $T \in \mathbb{R}^{r \times r}$. One way to obtain identifiability is to introduce the restriction $\beta' = [I_r, \beta_*']$, with $\beta_* \in \mathbb{R}^{(s-r) \times r}$. With this additional restriction the parameter vector is given by stacking the vectorized matrices $\alpha, \beta_*, \Gamma_1, \dots, \Gamma_{p-1}$ and Φ, similarly to (2). Note for completeness that the normalization of β may necessitate a re-ordering of the variables in $y_t$ since—without potential reordering—this parameterization implies a restriction of generality as, e.g., processes, where the first variable is integrated, but does not cointegrate with the other variables, cannot be represented.
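The identification problem and its resolution can be made concrete numerically. The following sketch (our own helper, assuming the upper r × r block of an arbitrary rank factorization of Π is regular, which may require the reordering of variables mentioned above) maps a reduced rank matrix Π to the normalized factors:

```python
import numpy as np

def normalized_factors(Pi, r):
    """Factor Pi (rank r) as alpha @ beta.T with beta' = [I_r, beta_*'].

    Sketch only: the normalization assumes the upper r x r block of
    the initial beta factor is invertible.
    """
    U, sv, Vt = np.linalg.svd(Pi)
    alpha, beta = U[:, :r] * sv[:r], Vt[:r].T   # Pi = alpha @ beta.T
    T_mat = beta[:r, :r]                        # upper block; must be regular
    beta_n = beta @ np.linalg.inv(T_mat)        # upper block becomes I_r
    alpha_n = alpha @ T_mat.T                   # product remains unchanged
    return alpha_n, beta_n
```

Since $\alpha\beta' = (\alpha T)(\beta (T')^{-1})'$ for any regular T, only the product is identified from the data; the normalization selects one representative of the equivalence class, exactly as in the discussion above.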
Define the following sets of transfer functions:
The dimension of the parameter vector depends on the dimension of the cointegrating space, thus the parameterization depends on r. The so-called reduced rank regression (RRR) estimator, given by the maximizer of the pseudo likelihood for given rank r, is consistent, see, e.g., Johansen (1995, chp. 6). The RRR estimator uses an “implicit” normalization of β and thereby implicitly addresses the mentioned identification problem. However, for testing hypotheses involving the free parameters in α or β, typically the identifying assumption given above is used, as discussed in Johansen (1995, chp. 7).
Furthermore, since the set of transfer functions corresponding to cointegrating rank r is a lower dimensional subset of the closure of the set corresponding to rank $r + 1$, pseudo likelihood ratio testing can be used to sequentially test for the rank r, starting with the hypothesis of rank $r = 0$ against the alternative of rank $r > 0$, and increasing the assumed rank consecutively until the null hypothesis is not rejected.
Ensuring that $y_t$ generated from (4) is indeed an I(1) process requires on the one hand that Π is of reduced rank, i.e., $r < s$, and on the other hand that the matrix
$$\alpha_\perp' \Big( I_s - \sum_{j=1}^{p-1} \Gamma_j \Big) \beta_\perp \qquad (5)$$
has full rank. It is well-known that condition (5) is fulfilled on the complement of a “thin” algebraic subset of the parameter space and is, therefore, ignored in estimation, as it is “generically” fulfilled.10
The I(2) case is similar in structure to the I(1) case, but with two rank restrictions and one full rank condition to exclude even higher integration orders. The corresponding VECM is given by
$$\Delta_0^2 y_t = \alpha\beta' y_{t-1} - \Gamma \Delta_0 y_{t-1} + \sum_{j=1}^{p-2} \Psi_j \Delta_0^2 y_{t-j} + \Phi d_t + \varepsilon_t, \qquad (6)$$
with $\alpha\beta'$ as defined in (4), $\Gamma := I_s - \sum_{j=1}^{p-1} \Gamma_j$ the middle factor in (5) and $\Psi_j \in \mathbb{R}^{s \times s}$, $j = 1, \dots, p-2$. From (5) we already know that reduced rank of
$$\alpha_\perp' \Gamma \beta_\perp = \xi \eta',$$
with ξ and η of full column rank, is required for higher integration orders. The condition for the corresponding solution process to be an I(2) process is given by full rank of a further matrix, given in (7), which again is typically ignored in estimation, just like condition (5) in the I(1) case. Thus, I(2) processes correspond to a “thin subset” of the set of systems satisfying the reduced rank conditions, which in turn constitutes a “thin subset” of the set of all considered VAR systems. The fact that integrated processes correspond to “thin sets” implies that obtaining estimated systems with specific integration and cointegration properties requires restricted estimation based on parameterizations tailor-made to highlight these properties.
Already for the I(2) case, formulating parameterizations that allow conveniently studying the integration and cointegration properties is a quite challenging task. Johansen (1997) contains several different (re-)parameterizations for the I(2) case and Paruolo (1996) defines “integration indices”, $(r, s_1, s_2)$ say, as the numbers of columns of the matrices β, $\beta_1$ and $\beta_2$. Clearly, the indices are linked to the ranks of the above matrices, as $r = \mathrm{rk}(\alpha\beta')$ and $s_1 = \mathrm{rk}(\alpha_\perp'\Gamma\beta_\perp)$, and the columns of $[\beta, \beta_1, \beta_2]$ form a basis of $\mathbb{R}^s$, such that $s = r + s_1 + s_2$. It holds that $\beta_2' y_t$ is an I(2) process without cointegration and $\beta_1' y_t$ is an I(1) process without cointegration. The process $\beta' y_t$ is typically I(1) and in this case cointegrates with $\Delta_0 y_t$ to stationarity. Thus, there is a direct correspondence of these indices to the dimensions of the different cointegrating spaces—both static and dynamic (with precise definitions given below in Definition 3).11 Please note that again, as already before in the I(1) case, different values of the integration indices $(r, s_1, s_2)$ lead to parameter spaces of different dimensions. Furthermore, in these parameterizations matrices describing different cointegrating spaces are (i) not identified and (ii) linked by restrictions, compare the discussion in Paruolo (2000, sct. 2.2) and (7). These facts render the analysis of the cointegration properties in I(2) VAR systems complicated. Also, in the I(2) VAR case usually some forms of RRR estimators are considered over suitable subsets of the parameter space, again based on implicit normalizations. Inference, however, again requires one to consider parameterizations explicitly.
Estimation and inference issues are fundamentally more complex in the VARMA case than in the VAR case. This stems from the fact that unrestricted estimation—unlike in the VAR case—is not possible due to a lack of identification, as discussed below. This means that in the VARMA case identification and parameterization issues need to be tackled as the first step, compare the discussion in Hannan and Deistler (1988, chp. 2).
In this paper, we consider VARMA processes as solutions of the vector difference equation
$$a(L) y_t = b(L) \varepsilon_t + \Phi d_t,$$
with $a(L) = I_s - a_1 L - \dots - a_p L^p$, where $a_j \in \mathbb{R}^{s \times s}$ for $j = 1, \dots, p$, $a_p \neq 0$, and the non-explosiveness condition $\det a(z) \neq 0$ for $|z| < 1$. Similarly, $b(L) = I_s + b_1 L + \dots + b_q L^q$, where $b_j \in \mathbb{R}^{s \times s}$ for $j = 1, \dots, q$, $b_q \neq 0$, and $\det b(z) \neq 0$ for $|z| < 1$. The transfer function corresponding to a VARMA process is $k(z) = a(z)^{-1} b(z)$.
It is well-known that without further restrictions the VARMA realization of the transfer function is not identified, i.e., different pairs of polynomial matrices $(a(z), b(z))$ can realize the same transfer function $k(z) = a(z)^{-1} b(z)$. It is clear that $(m(z)a(z))^{-1}(m(z)b(z)) = a(z)^{-1}b(z)$ for all non-singular polynomial matrices $m(z)$. Thus, the mapping attaching the transfer function $k(z)$ to the pair $(a(z), b(z))$ is not injective.12
Consequently, we refer for given rational transfer function $k(z)$ to the class of all pairs $(a(z), b(z))$ with $a(z)^{-1}b(z) = k(z)$ as a class of observationally equivalent VARMA realizations of $k(z)$. To achieve identification requires to define a canonical form, selecting one member of each class of observationally equivalent VARMA realizations for a set of considered transfer functions. A first step towards a canonical form is to only consider left coprime pairs $(a(z), b(z))$.13 However, left coprimeness is not sufficient for identification and thus further restrictions are required, leading to parameter vectors of smaller dimension. A widely used canonical form is the (reverse) echelon canonical form, see Hannan and Deistler (1988, Theorem 2.5.1, p. 59), based on (monic) normalizations of the diagonal elements of $a(z)$ and degree relationships between diagonal and off-diagonal elements as well as the entries in $b(z)$, which lead to zero restrictions. The (reverse) echelon canonical form in conjunction with a transformation to an error correction model was used in VARMA cointegration analysis in the I(1) case, e.g., in Poskitt (2006, Theorem 4.1), but, as for the VAR case, understanding the interdependencies of rank conditions already becomes complicated once one moves to the I(2) case.
In the VARMA case matters are further complicated by another well-known problem that makes statistical analysis considerably more involved compared to the VAR case. Although there exists a generalization of the autoregressive order to the VARMA case, such that any transfer function corresponding to a VARMA system has an order n (with the precise definition given in the next section), it is known since Hazewinkel and Kalman (1976) that no continuous parameterization of all rational transfer functions of order n exists if $s > 1$. Therefore, if one wants to keep the above-discussed advantages that continuity of a parameterization provides, the set of transfer functions of order n, henceforth referred to as $M_n$, has to be partitioned into sets on which continuous parameterizations exist, i.e., $M_n = \bigcup_{\gamma \in G} M_\gamma$, for some index set G, as already mentioned in the introduction.14 For any given partitioning it is important to understand the relationships between the different subsets $M_\gamma$, as well as the closures $\overline{M_\gamma}$ of the pieces, since in case of misspecification points in $\overline{M_\gamma} \setminus M_\gamma$ cannot be avoided even asymptotically in, e.g., pseudo maximum likelihood estimation. These are more complicated issues in the VARMA case than in the VAR case, see the discussion in Hannan and Deistler (1988, Remark 1 after Theorem 2.5.3).
Based on these considerations, the following section provides and discusses a parameterization that focuses on unit root and cointegration properties, resorting to the state space framework that—as mentioned in the introduction—provides advantages for cointegration analysis. In particular, we derive an almost everywhere homeomorphic parameterization, based on partitioning the set of all considered transfer functions according to a multi-index that contains, among other elements, the state space unit root structure. This implies that certain cointegration properties are invariant for all systems corresponding to a given subset, i.e., the parameterization allows one to directly impose cointegration properties such as the “integration indices” of Paruolo (1996) mentioned before.
3. The Canonical Form and the Parameterization
As a first step we define the class of VARMA processes considered in this paper, using the differencing operator defined in (3):
Definition 1.
The s-dimensional real VARMA process $(y_t)_{t \in \mathbb{Z}}$ has unit root structure $\Omega := ((\omega_1, h_1), \dots, (\omega_l, h_l))$ with $0 \leq \omega_1 < \omega_2 < \dots < \omega_l \leq \pi$ and integers $h_k > 0$, if it is a solution of the difference equation
$$\Delta_{\omega_1}^{h_1} \cdots \Delta_{\omega_l}^{h_l} (y_t - \Phi d_t) = v_t, \qquad (8)$$
where $(d_t)_{t \in \mathbb{Z}}$ is an m-dimensional deterministic sequence and $(v_t)_{t \in \mathbb{Z}}$ is a linearly regular stationary VARMA process, i.e., there exists a pair of left coprime matrix polynomials $(a(z), b(z))$ such that $a(L)v_t = b(L)\varepsilon_t$ for a white noise process $(\varepsilon_t)_{t \in \mathbb{Z}}$ with $\mathbb{E}\varepsilon_t\varepsilon_t' = \Sigma > 0$, with furthermore $\det a(z) \neq 0$ for $|z| \leq 1$ and $\det b(z) \neq 0$ for $|z| < 1$.
- The process is called a unit root process with unit roots $e^{i\omega_k}$, $k = 1, \dots, l$; the set $\{\omega_1, \dots, \omega_l\}$ is the set of unit root frequencies and the integers $h_1, \dots, h_l$ are the integration orders.
- A unit root process with unit root structure $\Omega = ((0, d))$, $d \in \mathbb{N}$, is an I(d) process.
- A unit root process with unit root structure $\Omega = ((\omega_1, 1), \dots, (\omega_l, 1))$ is an MFI(1) process.
A linearly regular stationary VARMA process has empty unit root structure.
As discussed in Bauer and Wagner (2012) the state space framework is convenient for the analysis of VARMA unit root processes. Detailed treatments of the state space framework are given in Hannan and Deistler (1988) and—in the context of unit root processes—Bauer and Wagner (2012).
A state space representation of a unit root VARMA process is15
$$x_{t+1} = A x_t + B \varepsilon_t, \qquad y_t = C x_t + \Phi d_t + \varepsilon_t, \qquad (9)$$
for a white noise process $(\varepsilon_t)_{t \in \mathbb{Z}}$, $\varepsilon_t \in \mathbb{R}^s$, a deterministic process $(d_t)_{t \in \mathbb{Z}}$, $d_t \in \mathbb{R}^m$, and the unobserved state process $(x_t)_{t \in \mathbb{Z}}$, $x_t \in \mathbb{C}^n$, with system matrices $A \in \mathbb{C}^{n \times n}$, $B \in \mathbb{C}^{n \times s}$, $C \in \mathbb{C}^{s \times n}$ and $\Phi \in \mathbb{R}^{s \times m}$.
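For concreteness, the following minimal simulation sketch generates data from a toy system of the form (9) without deterministic components; the system matrices are hypothetical choices (one unit root eigenvalue and one stable eigenvalue) and are not in canonical form:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 500
A = np.array([[1.0, 0.0],    # eigenvalue 1: unit root state
              [0.0, 0.5]])   # eigenvalue 0.5: stable state
B = np.array([[1.0, 0.3],
              [0.2, 1.0]])
C = np.array([[1.0, 0.5],
              [2.0, -0.4]])
eps = rng.standard_normal((T, 2))
x = np.zeros(2)
y = np.empty((T, 2))
for t in range(T):
    y[t] = C @ x + eps[t]    # y_t = C x_t + eps_t  (Phi d_t omitted)
    x = A @ x + B @ eps[t]   # x_{t+1} = A x_t + B eps_t
```

Both components of $y_t$ load on the single I(1) state through the first column of C, so they are cointegrated; this is picked up again in the sketch after Example 1 below.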
Remark 3.
Bauer and Wagner (2012, Theorem 2) show that every real valued unit root VARMA process as given in (8) has a real valued state space representation with real valued state and real valued system matrices $(A, B, C, \Phi)$. Considering complex valued state space representations in (9) is merely for algebraic convenience, as in general some eigenvalues of A are complex valued. Note for completeness that Bauer and Wagner (2012) contains a detailed discussion of why considering the A-matrix in the canonical form in (up to reordering) Jordan normal form is useful for cointegration analysis. For the sake of brevity we abstain from including this discussion again in the present paper. The key aspect of this construction is its usefulness for cointegration analysis, which becomes visible in Remark 4, where the “simple” unit root properties of blocks of the state vector are discussed.
The transfer function with real valued power series coefficients corresponding to a real valued unit root process as given in Definition 1 is a rational matrix function $k(z)$. The (possibly complex valued) matrix triple $(A, B, C)$ realizes the transfer function $k(z)$ if and only if $k(z) = I_s + z C (I_n - zA)^{-1} B$. Please note that, as for VARMA realizations, for a transfer function there exist multiple state space realizations $(A, B, C)$, with possibly different state dimensions n. A state space system is minimal if there exists no state space system of lower state dimension realizing the same transfer function. The order of the transfer function is the state dimension of a minimal system realizing it.
All minimal state space realizations of a transfer function only differ in the basis of the state (cf. Hannan and Deistler 1988, Theorem 2.3.4), i.e., for two minimal state space systems $(A_1, B_1, C_1)$ and $(A_2, B_2, C_2)$, realizing the same transfer function is equivalent to the existence of a regular matrix T such that $A_2 = T A_1 T^{-1}$, $B_2 = T B_1$ and $C_2 = C_1 T^{-1}$. Thus, the matrices $A_1$ and $A_2$ are similar for all minimal realizations of a transfer function.
By imposing restrictions on the matrices of a minimal state space system realizing $k(z)$, Bauer and Wagner (2012, Theorem 2) provide a canonical form, i.e., a mapping attaching to every transfer function in the set of transfer functions with real valued power series coefficients defined below a unique state space realization $(A, B, C)$. The set is defined as
To describe the necessary restrictions of the canonical form the following definition is useful:
Definition 2.
A matrix $B \in \mathbb{C}^{c \times s}$, $c \leq s$, is positive upper triangular (p.u.t.) if there exist integers $1 \leq j_1 < j_2 < \dots < j_c \leq s$ such that for $i = 1, \dots, c$ we have $B_{i, j_i} \in \mathbb{R}$, $B_{i, j_i} > 0$ and $B_{i, j} = 0$ for $j < j_i$; i.e., B is of the form
where the symbol * indicates unrestricted complex-valued entries.
A unique state space realization of the transfer function is given as follows (cf. Bauer and Wagner 2012, Theorem 2):
Theorem 1.
For every transfer function $k(z)$ in the set defined above there exists a unique minimal (complex) state space realization $(A, B, C)$ such that
with:
- (i)
- , where it holds for that
- -
- for :
- -
- for :
withwhere . - (ii)
- and are partitioned accordingly. It holds for that
- -
- for :
- -
- for :
- (iii)
- Partitioning in as , with it holds that is p.u.t. for for and .
- (iv)
- For define , with and for , with . Furthermore, define . It holds that and for for and .
- (v)
- and the stable subsystem of state dimension is in echelon canonical form (cf. Hannan and Deistler 1988, Theorem 2.5.2).
Remark 4.
As indicated in Remark 3 and discussed in detail in Bauer and Wagner (2012), considering complex valued quantities is merely for algebraic convenience. For econometric analysis, interest is, of course, in real valued quantities. These can be straightforwardly obtained from the representation given in Theorem 1 as follows. First define a transformation matrix (and its inverse):
Starting from the complex valued canonical representation , a real valued canonical representation
with real valued matrices follows from using the just defined transformation matrix. In particular it holds that:
with
Before we turn to the real valued state process corresponding to the real valued canonical representation, we first consider the complex valued state process in more detail. This process is partitioned according to the partitioning of the matrices into , where
with
For the sub-vectors are further decomposed into , with for according to the partitioning .
The partitioning of the complex valued process leads to an analogous partitioning of the real valued state process , , obtained from
with the corresponding block of the state equation given by
For the sub-vectors are further decomposed into , with if and if for and decomposed accordingly.
Bauer and Wagner (2012, Theorem 3, p. 1328) show that the processes have unit root structure for and . Furthermore, for and the processes are not cointegrated, as defined in Definition 3 below. For , the process is the -dimensional process of stochastic trends of order , while the components of , for , and the components of , for , are referred to as stochastic cycles of order at their corresponding frequencies .
Remark 5.
Parameterizing the stable part of the transfer function using the echelon canonical form is merely one possible choice. Any other canonical form of the stable subsystem and suitable parameterization based on it can be used instead for the stable subsystem.
Remark 6.
Starting from a state space system (9) with matrices in canonical form, a solution for $y_t$ (with the solution for $x_t$ obtained completely analogously)—for some starting value $x_1$—is given by
$$y_t = C A^{t-1} x_1 + \sum_{j=1}^{t-1} C A^{j-1} B \varepsilon_{t-j} + \varepsilon_t + \Phi d_t.$$
Clearly, the term $C A^{t-1} x_1$ is stochastically singular and effectively acts like a deterministic component, which may lead to an identification problem with $\Phi d_t$. If the deterministic component is rich enough to “absorb” the contribution of the unit root part of the state, then one solution of the identification problem is to set the starting value of the unit root part of the state to zero. Rich enough here means, e.g., that in the I(1) case $d_t$ contains an intercept. Analogously, in the MFI(1) case $d_t$ has to contain seasonal dummy variables corresponding to all unit root frequencies. The contribution of the stable part of the state decays exponentially and, therefore, does not impact the asymptotic properties of any statistical procedure. It is, therefore, inconsequential for statistical analysis but convenient (with respect to our definition of unit root processes) to choose the starting value of the stable part of the state as the steady state or stationary solution of the stable block of the state equation, which renders the stable part of the state—or, when the solution on $t \in \mathbb{Z}$ is considered, the stable subsystem—stationary. Please note that these issues with respect to starting values, potential identification problems and their impact or non-impact on statistical procedures also occur in the VAR setting.
Bauer and Wagner (2012, Theorem 2) show that minimality of the canonical state space realization implies full row rank of the p.u.t. blocks of B. In addition to proposing the canonical form, Bauer and Wagner (2012) also provide details on how to transform any minimal state space realization into canonical form: Given a minimal state space system realizing the transfer function, the first step is to find a similarity transformation T such that $TAT^{-1}$ is of the form given in (10) by using an eigenvalue decomposition, compare Chatelin (1993). In the second step the corresponding stable subsystem is transformed to echelon canonical form as described in Hannan and Deistler (1988, chp. 2). These two transformations do not lead to a unique realization, because the restrictions on A do not uniquely determine the unstable subsystem.
For example, in the case of unit root structure $((0, 1))$, where the unstable part of the state transition matrix equals $I_c$, such that $(I_c, B_1, C_1)$ is a corresponding state space system, the same transfer function is realized also by all systems $(I_c, T B_1, C_1 T^{-1})$, with T some regular matrix. To find a unique realization the product $C_1 B_1$ needs to be uniquely decomposed into factors $C_1$ and $B_1$. This is achieved by performing a QR decomposition of $C_1$ (without pivoting) that leads to a factor $C_1$ with orthonormal columns, i.e., $C_1^* C_1 = I_c$. The additional restriction of $B_1$ being a p.u.t. matrix of full row rank then leads to a unique factorization of the product into $C_1$ and $B_1$. In the general case with an arbitrary unit root structure, similar arguments lead to p.u.t. restrictions on sub-blocks of B and orthogonality restrictions on sub-blocks of C.
The canonical form introduced in Theorem 1 was designed to be useful for cointegration analysis. To see this, first requires a definition of static and polynomial cointegration (cf. Bauer and Wagner 2012, Definitions 3 and 4).
Definition 3.
- (i)
- Let and be two unit root structures. Then if
- -
- .
- -
- For all for and k such that it holds that .
Furthermore, if and . For two unit root structures define the decrease of the integration order at frequency , for , as - (ii)
- An s-dimensional unit root process with unit root structure Ω is cointegrated of order , where , if there exists a vector , such that has unit root structure . In this case the vector β is a cointegrating vector (CIV) of order .
- (iii)
- All CIVs of a given order span the cointegrating space of that order.
- (iv)
- An s-dimensional unit root process with unit root structure Ω is polynomially cointegrated of order , where , if there exists a vector polynomial , for some integer such that
- -
- has unit root structure ,
- -
- .
In this case the vector polynomial is a polynomial cointegrating vector (PCIV) of order . - (v)
- All PCIVs of order span the polynomial cointegrating space of order .
Remark 7.
- (i)
- It is merely a matter of taste whether cointegrating spaces are defined in terms of their order or their decrease, as defined above. Specifying Ω and the decrease contains the same information as providing the order of (polynomial) cointegration.
- (ii)
- Notwithstanding the fact that CIVs and PCIVs in general may lead to changes of the integration orders at different unit root frequencies, it may be of interest to “zoom in” on only one unit root frequency, thereby leaving the potential reductions of the integration orders at the other unit root frequencies unspecified. This allows one to—entirely similarly as in Definition 3—define cointegrating and polynomial cointegrating spaces of different orders at a single unit root frequency. Analogously one can also define cointegrating and polynomial cointegrating spaces of different orders for subsets of the unit root frequencies.
- (iii)
- In principle the polynomial cointegrating spaces defined so far are infinite-dimensional as the polynomial degree is not bounded. However, since every polynomial vector can be written as , where by definition has empty unit root structure, it suffices to consider PCIVs of polynomial degree smaller than the polynomial degree of . This shows that it is sufficient to consider finite dimensional polynomial cointegrating spaces. When considering, as in item (ii), (polynomial) cointegration only for one unit root it similarly suffices to consider polynomials of maximal degree equal to for real unit roots and for complex unit roots. Thus, in the I(2) case it suffices to consider polynomials of degree one.
- (iv)
- The argument about maximal relevant polynomial degrees given in item (iii) can be made more precise and combined with the decrease in Ω achieved. Every polynomial vector can be written as for . By definition it holds that has integration order at frequency . Thus, it suffices to consider PCIVs of polynomial degree smaller than for or for when considering the polynomial cointegrating space at with decrease . In the MFI(1) case therefore, when considering only one unit root frequency, again only polynomials of degree one need to be considered. This space is often referred to in the literature as dynamic cointegration space.
To illustrate the advantages of the canonical form for cointegration analysis consider
By Remark 4, the process is not cointegrated. This implies that reduces the integration order at unit root to if and only if and or equivalently and (using the transformation to the complex matrices of the canonical form, as discussed in Remark 4, and that if and only if ). Thus, the CIVs are characterized by orthogonality to sub-blocks of .
The real valued representation given in Remark 4, used in its partitioned form just above, immediately leads to a necessary orthogonality constraint for polynomial cointegration of degree one:
follows. Since all terms except the first are stationary or deterministic, a necessary condition for a reduction of the unit root structure is the orthogonality of to sub-blocks of or sub-blocks of the complex matrix . Please note, however, that this orthogonality condition is not sufficient for to be a PCIV, because it does not imply . For a detailed discussion of polynomial cointegration, when considering also higher polynomial degrees, see Bauer and Wagner (2012, sct. 5).
The following examples illustrate cointegration analysis in the state space framework for the empirically most relevant, i.e., the I(1), MFI(1) and I(2) cases.
Example 1 (Cointegration in the I(1) case).
In the I(1) case, neglecting the stable subsystem and the deterministic components for simplicity, it holds that
The vector , is a CIV of order if and only if .
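In terms of the toy simulation given after (9), the CIVs in the I(1) case can be computed directly as the left kernel of the C-block loading the unit root state, e.g., via an SVD; a sketch, reusing the variables defined there:

```python
# Continuation of the simulation sketch after (9): C[:, :1] loads the
# single I(1) state, so CIVs are the vectors orthogonal to this column.
C1 = C[:, :1]
U = np.linalg.svd(C1, full_matrices=True)[0]
beta = U[:, 1:]            # orthonormal basis of the left kernel of C1
print(beta.T @ C1)         # ~0 by construction
z = y @ beta               # beta' y_t: the stationary linear combination
```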
Example 2 (Cointegration in the MFI(1) case with complex unit root ).
In the MFI(1) case with unit root structure and complex unit root , neglecting the stable subsystem and the deterministic components for simplicity, it holds that
The vector is a CIV of order if and only if
The vector polynomial , with is a PCIV of order if and only if
which is equivalent to
The fact that the matrix in (11) has a block structure with two blocks of conjugate complex columns implies some additional structure also on the space of PCIVs, here with polynomial degree one. More specifically it holds that if is a PCIV of order , also is a PCIV of order . This follows from
Thus, the space of PCIVs of degree (up to) one inherits some additional structure emanating from the occurrence of complex eigenvalues in complex conjugate pairs.
Example 3 (Cointegration in the I(2) case).
In the I(2) case, neglecting the stable subsystem and the deterministic components for simplicity, it holds that
The vector is a CIV of order if and only if
The vector is a CIV of order if and only if
The vector polynomial , with is a PCIV of order if and only if
The above orthogonality constraint indicates that the two cases and have to be considered separately for polynomial cointegration analysis. Consider first the case . In this case the orthogonality constraints imply , and . Thus, the vector is a CIV of order and therefore is of “non-minimum” degree, one in this case rather than zero (). For a formal definition of minimum degree PCIVs see Bauer and Wagner (2003, Definition 4). In case there are PCIVs of degree one that are not simple transformations of static CIVs. Consider such that is stationary. The integrated contribution to is given by , with . This term is eliminated by in , if , which is only possible if . Additionally, needs to hold, such that there is no further integrated contribution to . Neither nor are CIVs since both violate the necessary conditions given in the definition of CIVs, which implies that is indeed a “minimum degree” PCIV.
As was shown above, the unit root and cointegration properties of the process depend on the sub-blocks of C and the eigenvalue structure of A. We, therefore, define the more encompassing state space unit root structure containing information on the geometrical and algebraic multiplicities of the eigenvalues of A (cf. Bauer and Wagner 2012, Definition 2).
Definition 4.
A unit root process with a canonical state space representation as given in Theorem 1 has state space unit root structure
where for . For with empty unit root structure .
Remark 8.
The state space unit root structure contains information concerning the integration properties of the process, since the integers appearing in it describe (multiplied by two for k such that $\omega_k \in (0, \pi)$) the numbers of non-cointegrated stochastic trends or cycles of the corresponding integration orders, compare again Remark 4. As such, the state space unit root structure describes properties of the stochastic process—and, therefore, partitions unit root processes according to these (co-)integration properties. These (co-)integration properties, however, are invariant to the chosen canonical representation, or more generally invariant to whether a VARMA or state space representation is considered. For all minimal state space representations of a unit root process these indices—being related to the Jordan normal form—are invariant.
As mentioned in Section 2, Paruolo (1996, Definition 3) introduces integration indices at frequency zero as a triple of integers $(r, s_1, s_2)$. These correspond to the numbers of columns of the matrices β, $\beta_1$ and $\beta_2$ in the error correction representation of I(2) VAR processes, see, e.g., Johansen (1997, sct. 3). Here, $s_2$ is the number of stochastic trends of order two. Furthermore, $s_1$ is the number of stochastic trends of order one that do not cointegrate. Therefore, the integration indices at frequency zero are in one-to-one correspondence with the state space unit root structure for I(2) processes and the dimension of the process.
The canonical form given in Theorem 1 imposes p.u.t. structures on sub-blocks of the matrix B. The occurrence of these blocks—related to the unit root part of the state—is determined by the state space unit root structure. The number of free entries in these p.u.t. blocks, however, is not determined by the state space unit root structure alone. Consequently, we need structure indices indicating for each row the position of the potentially restricted positive element, as formalized below:
Definition 5 (Structure indices).
For the block of the matrix of a state space realization in canonical form, define the corresponding structure indices as
Remark 9.
Since sub-blocks of corresponding to complex unit roots are of the form , the entries restricted to be positive are located in the same columns and rows of both and . Thus, the structure indices of the corresponding rows are identical for and . Therefore, it would be possible to omit the parts of p corresponding to the blocks . It is, however, as will be seen in Definition 9, advantageous for the comparison of unit root structures and structure indices that p is a vector with entries.
Example 4.
Consider the following state space system:
In canonical form and are p.u.t. matrices and is unrestricted. If, e.g., the second entry of and the first entry of are restricted to be positive, then
where the symbol * denotes unrestricted entries. In this case .
For given state space unit root structure the matrix A is fully determined. The parameterization of the set of feasible matrices B for given structure indices p and of the set of stable subsystems for given Kronecker indices (cf. Hannan and Deistler 1988, chp. 2) is straightforward, since the entries in these matrices are either unrestricted, restricted to zero or restricted to be positive. Matters are a bit more complicated for C. One possibility to parameterize the set of possible matrices C for a given state space unit root structure is to use real and complex valued Givens rotations (cf. Golub and van Loan 1996, chp. 5.1).
Definition 6 (Real Givens rotation).
The real Givens rotation $G_{j,k}(\theta) \in \mathbb{R}^{n \times n}$, $1 \leq j < k \leq n$, $\theta \in [0, 2\pi)$, is defined as the identity matrix $I_n$ with the entries at the intersections of rows and columns j and k replaced according to
$$\begin{bmatrix} G_{j,j} & G_{j,k} \\ G_{k,j} & G_{k,k} \end{bmatrix} := \begin{bmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{bmatrix}.$$
Remark 10.
Givens rotations allow transforming any vector $x \in \mathbb{R}^n$ into a vector of the form $(r, 0, \dots, 0)'$ with $r = \|x\| \geq 0$. This is achieved by the following algorithm:
- 1.
- Set , and .
- 2.
- Represent using polar coordinates as , with and . If , set (cf. Otto 2011, chp. 1.5.3, p. 39). Then such that , with .
- 3.
- If , stop. Else increment j by one () and continue at step 2.
This algorithm determines a unique vector of rotation angles for every vector $x \in \mathbb{R}^n$.
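A minimal implementation sketch of this algorithm in Python; the ordering of the rotations and the handling of zero pairs follow one common convention and may differ in detail from the paper:

```python
import numpy as np

def real_givens_angles(x):
    """Reduce x in R^n to (r, 0, ..., 0)', r = ||x||, by successive 2x2
    rotations on adjacent pairs, returning r and the rotation angles."""
    x = np.asarray(x, dtype=float).copy()
    thetas = []
    for j in range(len(x) - 1, 0, -1):
        a, b = x[j - 1], x[j]
        r = np.hypot(a, b)
        theta = np.arctan2(b, a) if r > 0 else 0.0  # zero pair: angle set to 0
        thetas.append(theta)
        x[j - 1], x[j] = r, 0.0
    return x[0], thetas
```

Applying the 2 x 2 rotation [[cos θ, sin θ], [-sin θ, cos θ]] with θ = atan2(b, a) to a pair (a, b)' yields (√(a² + b²), 0)', so sweeping from the bottom up leaves the norm of x in the first entry.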
Remark 11.
The determinant of real Givens rotations is equal to one, i.e., $\det G_{j,k}(\theta) = 1$ for all $1 \leq j < k \leq n$ and all $\theta \in [0, 2\pi)$. Thus, it is not possible to factorize an orthonormal matrix Q with $\det(Q) = -1$ into a product of Givens rotations. This obvious fact has implications for the parameterization of C-matrices as is detailed below.
Definition 7 (Complex Givens rotation).
The complex Givens rotation , , is defined as
Remark 12.
Complex Givens rotations allow transforming any vector $x \in \mathbb{C}^n$ into a vector of the form $(r, 0, \dots, 0)'$ with $r = \|x\| \geq 0$ real. This is achieved by the following algorithm:
- 1.
- Set , and .
- 2.
- Represent using polar coordinates as , with and . If , set and if , set (cf. Otto 2011, chp. 8.1.3, p. 222).
- 3.
- SetThen such that , with .
- 4.
- If , stop. Else increment j by one () and continue at step 2.
This algorithm determines a unique vector of rotation angles for every vector $x \in \mathbb{C}^n$.
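An analogous sketch for the complex case; the polar-coordinate conventions for the two angles per rotation are again one common choice and may differ in detail from the paper:

```python
import numpy as np

def complex_givens_angles(x):
    """Reduce x in C^n to (r, 0, ..., 0)' with r = ||x|| >= 0 real,
    returning r, the (theta, phi) pairs and a final phase psi."""
    x = np.asarray(x, dtype=complex).copy()
    angles = []
    for j in range(len(x) - 1, 0, -1):
        a, b = x[j - 1], x[j]
        theta = np.arctan2(abs(b), abs(a))
        phi = (np.angle(b) - np.angle(a)) if abs(a) * abs(b) > 0 else 0.0
        c, s = np.cos(theta), np.sin(theta)
        # unitary block [[c, s e^{-i phi}], [-s e^{i phi}, c]] zeros x[j]
        x[j - 1], x[j] = c * a + s * np.exp(-1j * phi) * b, 0.0
        angles.append((theta, phi))
    psi = np.angle(x[0])       # final phase making the leading entry real
    return abs(x[0]), angles, psi
```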
To set the stage for the general case, we start the discussion of the parameterization of the set of C matrices in canonical form with the MFI(1) and I(2) cases. These two cases display all ingredients required later for the general case. The MFI(1) case illustrates the usage of either real or complex Givens rotations, depending on whether the considered block of C corresponds to a real or complex unit root. The I(2) case highlights recursive orthogonality constraints on the parameters of the C block, which are related to the polynomial cointegration properties (cf. Example 3).
3.1. The Parameterization in the MFI(1) Case
The state space unit root structure of an MFI(1) process is given by $((\omega_1, c_1), \dots, (\omega_l, c_l))$, say. For the corresponding state space system in canonical form, the sub-blocks of A are equal to $e^{i\omega_k} I_{c_k}$, the sub-blocks $B_k$ of B are p.u.t. and $C_k^* C_k = I_{c_k}$, for $k = 1, \dots, l$.
Starting with the sub-blocks $C_k$ of C, it is convenient to separate the discussion of the parameterization of C-blocks into the real case, where $\omega_k \in \{0, \pi\}$, and the complex case with $\omega_k \in (0, \pi)$. For the case of real unit roots the two cases $c_k < s$ and $c_k = s$ have to be distinguished. For brevity of notation refer to the considered real block simply as $C \in \mathbb{R}^{s \times c}$. Using this notation, the set of matrices to be parameterized is the set of all real s × c matrices with orthonormal columns, i.e., with $C'C = I_c$.
The parameterization of this set is based on the combination of real Givens rotations, as given in Definition 6, that allow transforming every matrix in the set to the form $[I_c, 0_{c \times (s-c)}]'$ for $c < s$. For $c = s$, Givens rotations allow transforming every matrix either to $I_s$ or to $\mathrm{diag}(1, \dots, 1, -1)$, since, compare Remark 11, for the transformed matrix Q it holds that $\det(Q) \in \{1, -1\}$. This is achieved with the following algorithm:
- 1. Set $j = 1$.
- 2. Transform the entries in the j-th row of the current matrix to zero, except for the j-th entry, which becomes a non-negative real number. Since this concerns a row vector, this is achieved by right-multiplication with transposed Givens rotations and the required parameters are obtained via the algorithm described in Remark 10. The first $j - 1$ entries of the j-th row remain unchanged.
- 3. If $j = c$ stop. Else increment j by one ($j \to j + 1$) and continue at step 2.
- 4. Collect all parameters used for the Givens rotations in steps 1 to 3 in a first parameter vector. Steps 1–3 correspond to a QR decomposition, with an orthonormal matrix Q given by the product of the Givens rotations. Please note that the first $j - 1$ entries of the j-th column of the transformed matrix are equal to zero by construction.
- 5. Set $j = 1$.
- 6. Collect the entries in column j which have not been transformed to zero by previous transformations into a vector. Using the algorithm described in Remark 10 transform this vector to $(1, 0, \dots, 0)'$ by left-multiplication with Givens rotations. Since Givens rotations are orthonormal, the transformed matrix is still orthonormal, so the leading entry equals one and the remaining entries in the corresponding row and column equal zero. An exception occurs if the relevant entries are all equal to zero; in this case the corresponding parameters are set to zero and no Givens rotations are defined.
- 7. If $j = c$ stop. Else increment j by one ($j \to j + 1$) and continue at step 6.
- 8. Collect all parameters used for the Givens rotations in steps 5 to 7 in a second parameter vector.
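The forward direction of this construction can be sketched compactly: starting from the two groups of angles, build a matrix with orthonormal columns as a product of Givens rotations applied to the first c columns of the identity matrix. The arrangement of the rotations below is our own and need not coincide with the canonical ordering; it does, however, exhibit the split into angles that move the column space and angles that only rotate the basis within it:

```python
import numpy as np

def givens(n, i, j, theta):
    """Identity with a 2x2 rotation by theta embedded in coordinates (i, j)."""
    G = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    G[i, i] = G[j, j] = c
    G[i, j], G[j, i] = s, -s
    return G

def C_from_angles(s, c, th_span, th_basis):
    """Forward map: angles -> s x c matrix with orthonormal columns.

    th_span  (c*(s-c) angles): rotations mixing span and co-span
             coordinates; they alone determine the column space.
    th_basis (c*(c-1)/2 angles): rotations inside the span; they only
             change the orthonormal basis, cf. Remark 14 below.
    """
    Q = np.eye(s)
    k = 0
    for i in range(c):
        for j in range(c, s):
            Q = Q @ givens(s, i, j, th_span[k])
            k += 1
    k = 0
    for i in range(c):
        for j in range(i + 1, c):
            Q = Q @ givens(s, i, j, th_basis[k])
            k += 1
    return Q[:, :c]    # orthonormal columns by construction
```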
The parameter vector θ, combining these two parts, contains the angles of the employed Givens rotations and provides one way of parameterizing the set of s × c matrices with orthonormal columns. The following Lemma 1 demonstrates the usefulness of this parameterization.
Lemma 1 (Properties of the parameterization of ).
Define for a mapping from by
with , where and . The following properties hold:
- (i)
- is closed and bounded.
- (ii)
- The mapping is infinitely often differentiable.
For , it holds that
- (iii)
- For every there exists a vector such thatThe algorithm discussed above defines the inverse mapping .
- (iv)
- The inverse mapping —the parameterization of —is infinitely often differentiable on the pre-image of the interior of . This is an open and dense subset of .
For , it holds that
- (v)
- is a disconnected space in with two disjoint non-empty closed subsets and .
- (vi)
- For every there exists a vector such thatIn this case, steps 1-4 of the algorithm discussed above define the inverse mapping .
- (vii)
- Define . Then a parameterization of is given byThe parameterization is infinitely often differentiable with infinitely often differentiable inverse on an open and dense subset of .
Remark 13.
The following arguments illustrate why the parameterization is not continuous on the pre-image of the boundary: Consider the unit sphere in $\mathbb{R}^3$. One way to parameterize the unit sphere is to use degrees of longitude and latitude. Two types of discontinuities occur: After fixing the location of the zero degree of longitude, i.e., the prime meridian, its anti-meridian is described by both 180°W and 180°E. Using the half-open interval $[0, 2\pi)$ in our parameterization causes a similar discontinuity. Second, the degree of longitude is irrelevant at the north pole. As seen in Remark 10, with our parameterization a similar issue occurs when the two entries of C to be compared are both equal to zero. In this case the parameter of the Givens rotation is set to zero, although every θ would produce the same result. Both discontinuities clearly occur on a thin subset of the parameter space.
As in the parameterization of the VAR I(1) case in the VECM framework, where the restriction $\beta' = [I_r, \beta_*']$ can only be imposed when the upper $r \times r$ block of the true β of the DGP is of full rank (cf. Johansen 1995, chp. 5.2), the set where the discontinuities occur can effectively be changed by a permutation of the components of the observed time series. This corresponds to redefining the locations of the prime meridian and the poles.
Remark 14.
Please note that the parameterization partitions the parameter vector θ into two parts, collected in steps 1–4 and in steps 5–8 of the algorithm above, respectively. Since changing the parameter values of the first part does not change the column space of C, which, as seen above, determines the cointegrating vectors, the second part fully characterizes the (static) cointegrating space. Please note that the dimension of this second part is $c(s - c)$ and thus coincides with the number of free parameters in β in the VECM framework (cf. Johansen 1995, chp. 5.2).
Example 5.
Consider the matrix
with and . As discussed, the static cointegrating space is characterized by the left kernel of this matrix. The left kernel of a matrix in with full rank two is given by a one-dimensional space, with the corresponding basis vector parameterized, when normalized to length one, by two free parameters. Thus, for the characterization of the static cointegrating space two parameters are required, which exactly coincides with the dimension of given in Remark 14. The parameters in correspond to the choice of a basis of the image of C. Having fixed the two-dimensional subspace through , only one free parameter for the choice of an orthonormal basis remains, which again coincides with the dimension given in Remark 14. To obtain the parameter vector, the starting point is a QR decomposition of . In this example , with to be determined. To find , solve for and . In other words, find and such that , which leads to , . Thus, the orthonormal matrix is equal to and the transpose of the upper triangular matrix is equal to:
Second, transform the entries in the lower -sub-block of to zero, starting with the last column. For this find such that , i.e., . This yields , . Next compute :
In the final step find such that , i.e., . The solution is , . Combining the transformations leads to
The parameter vector for this matrix is therefore with .
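A quick numerical spot check of these dimension counts (two column-space angles plus one basis angle for s = 3, c = 2), using the hypothetical forward-map sketch C_from_angles given before Lemma 1 with arbitrary angle values:

```python
import numpy as np  # C_from_angles as sketched before Lemma 1

rng = np.random.default_rng(1)
C = C_from_angles(3, 2, th_span=rng.uniform(0, 2 * np.pi, size=2),
                  th_basis=rng.uniform(0, 2 * np.pi, size=1))
print(np.allclose(C.T @ C, np.eye(2)))   # True: orthonormal columns
```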
In case of complex unit roots, referring for brevity again to the considered block simply as $C \in \mathbb{C}^{s \times c}$, the set of matrices to be parameterized is the set of all complex s × c matrices with orthonormal columns, i.e., with $C^* C = I_c$.
The parameterization of this set is based on the combination of complex Givens rotations, as given in Definition 7, which can be used to transform every matrix in this set to the form $(D', 0')'$ with D a diagonal c × c matrix whose diagonal elements are of unit modulus. This transformation is achieved with the following algorithm:
- 1. Set $j = 1$.
- 2. Transform the entries in the j-th row of the current matrix to zero, except for the j-th entry. Since this concerns a row vector, this is achieved by right-multiplication with transposed Givens rotations and the required parameters are obtained via the algorithm described in Remark 12. The first $j - 1$ entries of the j-th row remain unchanged.
- 3. If $j = c$ stop. Else increment j by one ($j \to j + 1$) and continue at step 2.
- 4. Collect all parameters used for the Givens rotations in steps 1 to 3 in a first parameter vector. Steps 1–3 correspond to a QR decomposition, with a unitary matrix Q given by the product of the Givens rotations. Please note that the first $j - 1$ entries of the j-th column of the transformed matrix are equal to zero by construction.
- 5. Set $j = 1$.
- 6. Collect the entries in column j which have not been transformed to zero by previous transformations into a vector. Using the algorithm described in Remark 12 transform this vector such that only its leading entry, of unit modulus, remains non-zero, by left-multiplication with Givens rotations. Since Givens rotations are unitary, the transformed matrix is still unitary, so the remaining entries in the corresponding row and column equal zero. An exception occurs if the relevant entries are all equal to zero; in this case the corresponding parameters are set to zero and no Givens rotations are defined.
- 7. If $j = c$ stop. Else increment j by one ($j \to j + 1$) and continue at step 6.
- 8. Collect all parameters used for the Givens rotations in steps 5 to 7 in a second parameter vector.
- 9. Transform the diagonal entries of the transformed matrix into polar coordinates and collect the angles in a third parameter vector.
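The elementary building block for this algorithm can be sketched as follows; the sign and phase conventions are our own choice of one common variant:

```python
import numpy as np

def complex_givens(n, i, j, theta, phi):
    """Unitary matrix: identity with the 2x2 block
    [[c, s e^{-i phi}], [-s e^{i phi}, c]] in coordinates (i, j)."""
    G = np.eye(n, dtype=complex)
    c, s = np.cos(theta), np.sin(theta)
    G[i, i] = G[j, j] = c
    G[i, j] = s * np.exp(-1j * phi)
    G[j, i] = -s * np.exp(1j * phi)
    return G

# The diagonal phases from step 9 enter as a diagonal factor, e.g.,
# D = np.diag(np.exp(1j * psi)) for a vector psi of angles.
```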
The following lemma demonstrates the usefulness of this parameterization.
Lemma 2 (Properties of the parameterization in the complex case).
Define for a mapping from by
with , where , and and where . The following properties hold:
- (i)
- is closed and bounded.
- (ii)
- The mapping is infinitely often differentiable.
- (iii)
- For every a vector exists such that . The algorithm discussed above defines the inverse mapping .
- (iv)
- The inverse mapping —the parameterization of —is infinitely often differentiable on an open and dense subset of .
Remark 15.
Note the partitioning of the parameter vector φ into the parts , and . The component fully characterizes the column space of , i.e., determines the cointegrating spaces.
Example 6.
Consider the matrix
The starting point is again a QR decomposition of . To find a complex Givens rotation such that with , transform the entries of into polar coordinates. The equation has the solutions and . Using the results of Remark 12, the parameters of the Givens rotation are and . Right-multiplication of C with leads to
Since the entries in the lower -sub-block of are already equal to zero, the remaining complex Givens rotations are . Finally, the parameter values corresponding to the diagonal matrix are and .
The parameter vector for this matrix is therefore , with .
Components of the Parameter Vector
Based on the results of the preceding sections we can now describe the parameter vectors for the general case. The dimensions of the parameter vectors of the respective blocks of the system matrices depend on the multi-index , consisting of the state space unit root structure , the structure indices p and the Kronecker indices for the stable subsystem. A parameterization of the set of all systems in canonical form with given multi-index for the MFI(1) case, therefore, combines the following components:
- , with: for , with denoting the j-th entry of the structure indices p corresponding to . The vectors contain the real and imaginary parts of the free entries in not restricted by the p.u.t. structures.
- : The vectors contain the entries in restricted by the p.u.t. structures to be positive reals.
- : The parameters for the matrices as discussed in Lemma 1 and Lemma 2.
- : The parameters for the stable subsystem in echelon canonical form for Kronecker indices .
Example 7.
Consider an MFI(1) process with , , , and system matrices
in canonical form. For this example it holds that , and
with parameter values corresponding to the C-blocks collected in considered in Examples 5 and 6.
3.2. The Parameterization in the I(2) Case
The canonical form provided above for the general case has the following form for I(2) processes with unit root structure :
where , and are p.u.t., , , , , and is in echelon canonical form with Kronecker indices . All matrices are real valued.
The parameterizations of the p.u.t. matrices and are as discussed above. The entries of are unrestricted and thus included in the parameter vector containing also the free entries in and . The subsystem is parameterized using the echelon canonical form.
The parameterization of proceeds as in the MFI(1) case, using . The parameterization of has to take the restriction of orthogonality of to into account, thus the set to be parameterized is given by
The parameterization of this set again uses real Givens rotations. For it follows that for a matrix such that with corresponding to . The matrix is parameterized as discussed in Lemma 1.
Corollary 1 (Properties of the parameterization of ).
Define for a mapping from by
where denotes the parameter values corresponding to as defined in Lemma 1. The following properties hold:
- (i)
- is closed and bounded.
- (ii)
- The mapping is infinitely often differentiable.
For , it holds
- (iii)
- For every there exists a vector such that . The algorithm discussed above Lemma 1 defines the inverse mapping .
- (iv)
- The inverse mapping —the parameterization of —is infinitely often differentiable on the pre-image of the interior of . This is an open and dense subset of .
For , it holds that
- (v)
- is a disconnected space with two disjoint non-empty closed subsets:
- (vi)
- For every there exists a vector such that . Steps 1–4 of the algorithm discussed above Lemma 1 define the inverse mapping .
- (vii)
- Define . Then a parameterization of is given by . The parameterization is infinitely often differentiable with infinitely often differentiable inverse on an open and dense subset of .
The proof of Corollary 1 uses the same arguments as the proof of Lemma 1 and is, therefore, omitted. It remains to provide a parameterization for restricted to be orthogonal to both and . Thus, the set to be parameterized is given by
The parameterization of is straightforward: Left multiplication of with as defined in Lemma 1 and of the lower - block with as defined in Corollary 1 transforms the upper -block to zero and collects the free parameters in the lower -block. Clearly, this is a bijective and infinitely often differentiable mapping on and thus a useful parameterization, since the matrix is only multiplied with two constant invertible matrices. The entries of the matrix product are then collected in a parameter vector as shown in Corollary 2.
Corollary 2 (Properties of the parameterization of ).
Define for given matrices and a mapping from by
where denotes the parameter values corresponding to as defined in Lemma 1 and denotes the parameter values corresponding to as defined in Corollary 1. The set is closed, and both and —the parameterization of —are infinitely often differentiable.
Components of the Parameter Vector
In the I(2) case, the multi-index contains the state space unit root structure , the structure indices , encoding the p.u.t. structures of and , and the Kronecker indices for the stable subsystem. The parameterization of the set of all systems in canonical form with given multi-index for the I(2) case uses the following components:
- : The vector contains the free entries in not restricted by the p.u.t. structure, collected in the same order as for the matrices in the MFI(1) case.
- : The vector contains the entries in restricted by the p.u.t. structures to be positive reals.
- : The parameters for the matrices as in the MFI(1) case and as discussed in Corollary 1.
- : The parameters for the matrix as discussed in Corollary 2.
- : The parameters for the stable subsystem in echelon canonical form for Kronecker indices .
Example 8.
Consider an I(2) process with , , and system matrices
In this case, , . It follows from
that and .
3.3. The Parameterization in the General Case
Inspecting the canonical form shows that all relevant building blocks are already present in the MFI(1) and the I(2) cases and can be combined to deal with the general case: The entries in are either unrestricted or follow restrictions according to given structure indices p, and the parameter space is chosen accordingly, as discussed for the MFI(1) and I(2) cases. The restrictions on the matrix and its blocks require more sophisticated parameterizations of parts of unitary or orthonormal matrices as well as of orthogonal complements. These are dealt with in Lemmas 1 and 2 and Corollaries 1 and 2 above. The extension of Corollaries 1 and 2 to complex matrices and to matrices which are orthogonal to a larger number of blocks of is straightforward.
The following theorem characterizes the properties of parameterizations for sets of transfer functions with (general) multi-index and describes the relations between sets of transfer functions and the corresponding sets of triples of system matrices in canonical form, defined below. Discussing the continuity and differentiability of mappings on sets of transfer functions and on sets of matrix triples also requires the definition of a topology on both sets.
Definition 8.
- (i)
- The set of transfer functions of order n, , is endowed with the pointwise topology : First, identify transfer functions with their impulse response sequences. Then, a sequence of transfer functions converges in to if and only if for every it holds that .
- (ii)
- The set of all triples in canonical form corresponding to transfer functions with multi-index Γ is called . The set is endowed with the topology corresponding to the distance .
Please note that in the definition of the pointwise topology convergence does not need to be uniform in j; moreover, the power series coefficients do not need to converge to zero for , hence the concept can also be used for unstable systems.
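As a minimal illustration of the pointwise topology, assuming transfer functions are represented by their impulse response coefficients, the following sketch (names ours) checks coefficient-wise convergence up to a truncation lag. The example sequence has coefficients that diverge in j, i.e., each member is unstable, yet it converges pointwise to a unit root limit.

```python
import numpy as np

def converges_pointwise(k_n, k_lim, J, tol=1e-6):
    """Check k_n(j) -> k_lim(j) coefficient by coefficient up to lag J;
    no uniformity over j is required."""
    return all(np.linalg.norm(k_n(j) - k_lim(j)) < tol for j in range(J))

# unstable example: k_n(j) = (1 + 1/n)^j -> k_lim(j) = 1 for every fixed j,
# although the coefficients of each k_n diverge as j grows
n = 10**6
assert converges_pointwise(lambda j: np.atleast_1d((1 + 1/n)**j),
                           lambda j: np.atleast_1d(1.0), J=50, tol=1e-3)
```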
Theorem 2.
The set can be partitioned into pieces , where , i.e.,
where , with for and for is the state dimension of the unstable subsystem with state space unit root structure and is the state dimension of the stable subsystem with Kronecker indices .
For every multi-index Γ there exists a parameter space for some integer , endowed with the Euclidean norm, and a function , such that for every the parameter vector is composed of:
- The parameter vector , collecting the (real and imaginary parts of) non-restricted entries in as described in the MFI(1) case.
- The parameter vector , collecting the entries in restricted by the p.u.t. forms to be positive reals in a similar fashion as described for in the I(2) case.
- The parameter vector , collecting the parameters for all blocks , and , obtained using Givens rotations (see Lemmas 1 and 2 and Corollary 1 and its extension to complex matrices).
- The parameter vector , collecting the parameters (real and imaginary parts for complex roots) for , and , subject to the orthogonality restrictions (see Corollary 2 and its extension to complex matrices).
- The parameter vector collecting the free entries in echelon canonical form with Kronecker indices .
- (i)
- The mapping that attaches a triple in canonical form to a transfer function in is continuous. It is the inverse (restricted to ) of the -continuous function .
- (ii)
- Every parameter vector corresponds to a triple and a transfer function . The mapping is continuous on .
- (iii)
- For every multi-index Γ the set of points in , where the mapping is continuous, is open and dense in .
As mentioned in Section 2, the parameterization of is straightforward. The entries of are collected in a parameter vector . Thus, there is a one-to-one correspondence between state space realizations and parameter vectors . The same holds true for the parameters of the symmetric, positive definite innovation covariance matrix , obtained, e.g., from a lower triangular Cholesky factor of .
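A minimal sketch of the Cholesky-based parameterization of the innovation covariance matrix, collecting the entries of the lower triangular factor directly (variants that log-transform the diagonal are equally possible but not used here; function names are ours):

```python
import numpy as np

def sigma_to_param(Sigma):
    """Free parameters of a symmetric positive definite Sigma: the entries of
    its (unique) lower triangular Cholesky factor with positive diagonal."""
    L = np.linalg.cholesky(Sigma)
    return L[np.tril_indices_from(L)]        # s(s+1)/2 free parameters

def param_to_sigma(theta, s):
    """Inverse map: rebuild L and form Sigma = L L'."""
    L = np.zeros((s, s))
    L[np.tril_indices(s)] = theta
    return L @ L.T

Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
assert np.allclose(param_to_sigma(sigma_to_param(Sigma), 2), Sigma)
```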
4. The Topological Structure
The parameterization of in Theorem 2 partitions into subsets for a selection of multi-indices . To every multi-index there corresponds an associated parameter set . Thus, in practical applications, maximizing the pseudo likelihood requires choosing the multi-index . Maximizing the pseudo likelihood over the set effectively amounts to including also all elements in the closure of , because of the continuity of the parameterization. It is thus necessary to characterize the closures of the sets .
Moreover, maximizing the pseudo likelihood function over all possible multi-indices is time-consuming and not desirable. Fortunately, the results discussed below show that there exists a generic multi-index such that . This generic choice corresponds to the set of all stable systems of order n corresponding to the generic neighborhood of the echelon canonical form. This multi-index, therefore, is a natural starting point for estimation.
However, in particular for hypothesis testing, it will be necessary to maximize the pseudo likelihood over sets of transfer functions of order n with specific state space unit root structure , denoted as below, where denotes the dimension of the stable part of the state. We show below that also in this case there exists a generic multi-index such that .
The main tool to obtain these results is investigating the properties of the mappings , that map transfer functions in to triples , as well as analyzing the closures of the sets . The relation between parameter vectors and triples of system matrices is easier to understand than the relation between and , due to the results of Theorem 2. Consequently, this section focuses on the relations between and —and their closures—for different multi-indices .
To define the closures we embed the sets of matrices in canonical form with multi-indices corresponding to transfer functions of order n into the space of all conformable complex matrix triples with , where additionally . Since the elements of are matrix triples, this set is isomorphic to a subset of the finite dimensional space , equipped with the Euclidean topology. Please note that also contains non-minimal state space realizations, corresponding to transfer functions of lower order.
Remark 16.
In principle the set also contains state space realizations of transfer functions with complex valued coefficients . Since the subset of of state space systems realizing transfer functions with real valued is closed in , realizations corresponding to transfer functions with coefficients with non-zero imaginary part are irrelevant for the analysis of the closures of the sets .
After investigating the closure of in , denoted by , we consider the set of corresponding transfer functions . Since we effectively maximize the pseudo likelihood over , we have to understand for which multi-indices the set is a subset of . Moreover, we find a covering of . This restricts the set of multi-indices that may occur as possible multi-indices of the limit of a sequence in and thus the set of transfer functions that can be obtained by maximization of the pseudo likelihood.
The sets , are embedded into the vector space M of all causal transfer functions . The vector space M is isomorphic to the infinite dimensional space equipped with the pointwise topology. Since, as mentioned above, maximization of the pseudo likelihood function over effectively includes , it is important to determine for any given multi-index , the multi-indices for which the set is a subset of . Please note that is not necessarily equal to . The continuity of , as shown in Theorem 2 (i), implies the following inclusions:
In general all these inclusions are strict. For a discussion in case of stable transfer functions see Hannan and Deistler (1988, Theorem 2.5.3).
We first define a partial ordering on the set of multi-indices . Subsequently we examine the closure in and finally we examine the closures in M.
Definition 9.
- (i)
- For two state space unit root structures and with corresponding matrices and in canonical form, it holds that if and only if there exists a permutation matrix S such that . Moreover, holds if additionally .
- (ii)
- For two state space unit root structures and and dimensions of the stable subsystems we define . Strict inequality holds, if at least one of the two inequalities above holds strictly.
- (iii)
- For two pairs and with corresponding matrices and in canonical form, it holds that if and only if there exists a permutation matrix S such that , where and restricts at least as many entries as , i.e., holds for all . Moreover, holds if additionally .
- (iv)
- Let and . Then if and only if . Moreover, holds, if at least one inequality is strict (compare Hannan and Deistler 1988, sct. 2.5).
Finally, define
Strict inequality holds, if at least one of the inequalities above holds strictly.
Please note that (i) implies that only contains unit roots that are also contained in , with the integration orders of the unit roots in smaller or equal to the integration orders of the respective unit roots in . Thus, denoting the unit root structures corresponding to and by and , it follows that implies . The reverse does not hold as, e.g., for (where hence ) and (with ) it holds that , but neither nor holds as here
This partial ordering is convenient for the characterization of the closure of .
4.1. The Closure of in
Please note that the block-structure of implies that every system in can be separated into two subsystems and . Define as the set of all state space realizations in canonical form corresponding to state space unit root structure , structure indices p and . Analogously define as the set of all state space realizations in canonical form with and Kronecker indices . Examining and separately simplifies the analysis.
4.1.1. The Closure of
The canonical form imposes substantial structure, i.e., restrictions, on the matrices and . By definition and the closures of the three matrices can be analyzed separately. and are easy to investigate. The structure of is fully determined by and consequently consists of a single matrix, which immediately implies that . The matrix , compare Theorem 1, is composed of blocks that are sub-blocks of unitary (or orthonormal) matrices and blocks that have to fulfill (recursive) orthogonality constraints. The corresponding sets were shown to be closed in Lemmas 1 and 2 and Corollaries 1 and 2. Thus, .
It remains to discuss . The structure indices p defining the p.u.t. structures of the matrices restrict some entries to be positive. Combining all the parameters—unrestricted ones, with complex values parameterized by real and imaginary parts, and the positive entries—into a parameter vector leads to an open subset of for some m. For convergent sequences of systems with fixed and p, limits of entries restricted to be positive may be zero. When this happens, two cases have to be distinguished. First, all p.u.t. sub-matrices still have full row rank. In this case the limiting system, say, is still minimal and can be transformed to a system in canonical form with fewer unrestricted entries in .
Second, if at least one of the row ranks of the p.u.t. blocks decreases in the limit, the limiting system is no longer minimal. Consequently, in the limit.
To illustrate this point consider again Example 4 with Equation (12) rewritten as
If and , , it holds that is an I(2) process with state space unit root structure .
Now consider a sequence of systems with all parameters except for constant and . The limiting system is then given by
In the limiting system is redundant and is an I(1) process rather than an I(2) process. Dropping leads to a state space realization of the limiting system given by
In case has full rank, the above system is minimal. Since , the matrix needs to be transformed into p.u.t. format. By definition all systems in the sequence, with , have structure indices as discussed in Example 12. The limiting system—in case of full rank of —has indices . To relate to Definition 9 choose the permutation matrix to arrive at
This shows that , and thus the limiting system has a smaller multi-index than the systems of the sequence. In case has reduced rank equal to one, a further reduction of the system order to along the same lines is possible, again leading to a limiting system with smaller multi-index .
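The rank-drop mechanism can be illustrated with a two-dimensional toy state equation; this system is ours for illustration and is not the system of Example 4. For d ≠ 0 the first state component integrates the second and is therefore I(2); at d = 0 the chain breaks and only I(1) behavior remains.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 20000
eps = rng.standard_normal((T, 2))

def first_state(d):
    """Simulate x_{t+1} = [[1, d], [0, 1]] x_t + eps_t; return the first
    state component."""
    x = np.zeros(2)
    path = np.empty(T)
    for t in range(T):
        x = np.array([[1.0, d], [0.0, 1.0]]) @ x + eps[t]
        path[t] = x[0]
    return path

# For d = 1 the first component is I(2): its first difference is the second
# component (a random walk) plus noise. For d = 0 it is a plain random walk.
i2, i1 = first_state(1.0), first_state(0.0)
print(np.std(np.diff(i2)), np.std(np.diff(i1)))  # the first is much larger
```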
The discussion shows that the closure of is related to lower order systems in the sense of Definition 9. The precise statement is given in Theorem 3 after a discussion of the closure of the stable subsystems.
4.1.2. The Closure of
Consider a convergent sequence of systems in and denote the limiting system by . Clearly, holds true for the limit of the sequence with for all j. Therefore, two cases have to be discussed for the limit:
- If , the potentially non-minimal limiting system corresponds to a minimal state space realization with Kronecker indices smaller or equal to (cf. Hannan and Deistler 1988, Theorem 2.5.3).
- If , the limiting matrix is similar to a block matrix , where all eigenvalues of have unit modulus and .
The first case is well understood, compare Hannan and Deistler (1988, chp. 2), since the limit in this case corresponds to a stable transfer function. In the second case the limiting system can be separated into two subsystems and , according to the block diagonal structure of . The state space unit root structure of the limiting system depends on the multiplicities of the eigenvalues of the matrix and is greater (in the sense of Definition 9) than the empty state space unit root structure. At the same time the Kronecker indices of the subsystem are smaller than , compare again Hannan and Deistler (1988, chp. 2). Since the Kronecker indices impose restrictions on some entries of the matrices and thus also on , the block and consequently also the limiting state space unit root structure might be subject to further restrictions.
4.1.3. The Conformable Index Set and the Closure of
The previous subsection shows that the closure of contains not only systems corresponding to transfer functions with multi-index smaller or equal to , but also systems that are related in a different way, formalized below.
Definition 10 (Conformable index set).
Given a multi-index , the set of conformable multi-indices contains all multi-indices , where:
- The pair with corresponding matrix in canonical form extends with corresponding matrix in canonical form, i.e., there exists a permutation matrix S such that
- .
- .
Please note that the definition implies . The importance of the set is clarified in the following theorem:
Theorem 3.
Transfer functions corresponding to state space realizations with multi-index are contained in the set . The set is contained in the union of all sets for with conformable to Γ, i.e.,
Theorem 3 provides a characterization of the transfer functions corresponding to systems in the closure of . The conformable set plays a key role here, since it characterizes the set of all minimal systems that can be obtained as limits of convergent sequences from within the set . Conformable indices extend the matrix corresponding to the unit root structure by the block .
The second inclusion in Theorem 3 is potentially strict, depending on the Kronecker indices in . Equality holds, e.g., in the following case:
Corollary 3.
For every multi-index Γ with the set of conformable indices consists only of Γ, which implies .
4.2. The Closure of
It remains to investigate the closure of in M. Hannan and Deistler (1988, Theorem 2.6.5 (ii) and Remark 3, p. 73) show that for any order n, there exist Kronecker indices corresponding to the generic neighborhood for transfer functions of order n such that
where . Here denotes the set of all transfer functions of order n with state space realizations satisfying . Every transfer function in can be approximated by a sequence of transfer functions in .
It can be easily seen that a generic neighborhood also exists for systems with state space unit root structure and without stable subsystem: Set the structure indices p to have a minimal number of elements restricted in p.u.t. sub-blocks of , i.e., for any block , or in case of a real unit root, set the corresponding structure indices to . Any p.u.t. matrix can be approximated by a matrix in this generic neighborhood with some positive entries restricted by the p.u.t. structure tending to zero. Combining these results with Theorem 3 implies the existence of a generic neighborhood for the canonical form considered in this paper:
Theorem 4.
Let be the set of all transfer functions with state space unit root structure . For every and , there exists a multi-index such that
Moreover, it holds that for every and satisfying .
Theorem 4 is the basis for choosing a generic multi-index for maximizing the pseudo likelihood function. For every and there exists a generic piece that—in its closure—contains all transfer functions of order and state space unit root structure : The set of transfer functions corresponding to the multi-index with the largest possible structure indices p in the sense of Definition 9 (iii) and generic Kronecker indices for the stable subsystem. Choosing these sets and their corresponding parameter spaces as model sets is, therefore, the most convenient choice for numerical maximization, if only and are known.
If, e.g., only an upper bound for the system order n is known and the goal is only to obtain consistent estimators, using is a feasible choice, since all transfer functions in the closure of the set can be approximated arbitrarily well, regardless of their potential state space unit root structure , . For testing hypotheses, however, it is important to understand the topological relations between sets corresponding to different multi-indices . In the following we focus on the multi-indices for arbitrary and .
The closure of contains also transfer functions that have a different state space unit root structure than . Considering convergent sequences of state space realizations of transfer functions in , the state space unit root structure of may differ in three ways:
- For sequences in canonical form rows of can tend to zero, which reduces the state space unit root structure as discussed in Section 4.1.1.
- Stable eigenvalues of may converge to the unit circle, thereby extending the unit root structure.
- Off-diagonal entries of the sub-block of may converge to zero in the sub-block of the limit in canonical form, resulting in a different attainable state space unit root structure. Here for all are regular matrices transforming to canonical form and transforms accordingly.
The first change of described above results in a transfer function with smaller state space unit root structure according to Definition 9 (ii). The implications of the other two cases are summarized in the following definition:
Definition 11 (Attainable unit root structures).
For given and the set of attainable unit root structures contains all pairs , where with corresponding matrix in canonical form extends with corresponding matrix in canonical form, i.e., there exists a permutation matrix S such that
where can be obtained by replacing off-diagonal entries in by zeros and where with the dimension of .
Remark 17.
It is a direct consequence of the definition of that implies .
Theorem 5.
- (i)
- is -open in (see Definition 8 for a definition of ).
- (ii)
- For every generic multi-index corresponding to and it holds that
Theorem 5 has important consequences for statistical analysis, e.g., PML estimation, since—as stated several times already—maximizing the pseudo likelihood function over effectively amounts to calculating the supremum over the larger set . Depending on the choice of the following asymptotic behavior may occur:
- If is chosen correctly and the estimator of the transfer function is consistent, openness of in its closure implies that the probability of the estimator being an interior point of tends to one asymptotically. Since the mapping attaching the parameters to the transfer function is continuous on an open and dense set, consistency in terms of transfer functions implies generic consistency of the parameter estimators.
- If the multi-index is incorrectly chosen to equal , estimator consistency is still possible if the true multi-index , as in this case . This is not too surprising: it parallels the well-known result in the simpler VAR framework that consistency of OLS can be established when the true autoregressive order is smaller than the order chosen for estimation. Thus, analogous to the lag order in the VAR case, a necessary condition for consistency is to choose the system order larger than or equal to the true system order.
Finally, note that Theorem 5 also implies the following result relevant for the determination of the unit root structure, further discussed in Section 5.1.1 and Section 5.2.1:
Corollary 4.
For every pair it holds that
5. Testing Commonly Used Hypotheses in the MFI(1) and I(2) Cases
This section discusses a large number of hypotheses, i.e., restrictions, on cointegrating spaces, adjustment coefficients and deterministic components that are often tested in the empirical literature. As in the VECM framework, discussed for the I(2) case in Section 2, testing hypotheses on the cointegrating spaces or adjustment coefficients may necessitate different reparameterizations.
5.1. The Case
By far the two most widely used cases of MFI(1) processes are processes and seasonally (co)integrated processes for quarterly data with state space unit root structure . In general, assuming for notational simplicity and , it holds that for and we have
The above equation provides an additive decomposition of into stochastic trends and cycles, the deterministic and stationary components. The stochastic cycles at frequency are, of course, given by the combination of sine and cosine terms. For the MFI(1) case this can also be seen directly from considering the real valued canonical form discussed in Remark 4, with the matrices for , given by in this case.
The ranks of are equal to the integers in . The number of stochastic trends is equal to , the number of stochastic cycles at frequency is equal to for and equal to if , as discussed in Section 3.
Moreover, in the MFI(1) case, is linked to the complex cointegrating rank at frequency , defined in Johansen (1991) and Johansen and Schaumburg (1999) in the VECM case as the rank of the matrix . For VARMA processes with arbitrary integration orders the complex cointegrating rank at frequency is , where is the transfer function, with in the MFI(1) case. Thus, in the MFI(1) case, determination of the state space unit root structure corresponds to determination of the complex cointegrating ranks in the VECM case.
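Numerically, such rank conditions can be checked by evaluating the relevant polynomial matrix at the unit root and computing a numerical rank from singular values. The sketch below treats the VAR special case with the common convention a(z) = I - A_1 z - ... - A_p z^p; function names and the tolerance rule are our choices.

```python
import numpy as np

def ar_poly_at(z, A_list):
    """Evaluate a(z) = I - A_1 z - ... - A_p z^p at a complex point z."""
    s = A_list[0].shape[0]
    val = np.eye(s, dtype=complex)
    for j, Aj in enumerate(A_list, start=1):
        val = val - Aj * z**j
    return val

def numerical_rank(M, tol=1e-8):
    """Numerical rank via singular values."""
    svals = np.linalg.svd(M, compute_uv=False)
    return int((svals > tol * max(svals[0], 1.0)).sum())

# VAR(1) with eigenvalues 0 and 1: a unit root at z = 1 with cointegrating
# rank 1, matching the rank of a(1)
A1 = np.array([[0.5, 0.5], [0.5, 0.5]])
print(numerical_rank(ar_poly_at(1.0, [A1])))  # -> 1
```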
In the VECM setting, the matrix is usually factorized into , as presented for the I(1) case in Section 2. For the column space of gives the cointegrating space of the process at frequency . For the relation between the column space of and the space of CIVs and PCIVs at the corresponding frequency is more involved. The columns of are orthogonal to the columns of , the sub-block of from a state space realization in canonical form corresponding to the VAR process. Analogously, the column space of the matrix , containing the so-called adjustment coefficients, is orthogonal to the row space of the sub-block of .
Both integers and are related to the dimensions of the static and dynamic cointegrating spaces in the MFI(1) case: For , the cointegrating rank coincides with the dimension of the static cointegrating space at frequency . Furthermore, the dimension of the static cointegrating space at frequency is bounded from above by , since it is spanned by at most vectors orthogonal to the complex valued matrix . The dimension of the dynamic cointegrating space at is equal to . Identifying again with the vector , a basis of the dynamic cointegrating space at is then given by the column space of the product
with the columns of spanning the orthogonal complement of the column space of , i.e., is of full rank and . This holds true since both factors are of full rank and satisfies , which corresponds to the necessary condition given in Example 2 for the columns of to be PCIVs. The latter also implies for , highlighting again the additional structure of the cointegrating space emanating from the complex conjugate pairs of eigenvalues (and matrices) as discussed in Example 2.
Please note that the relations between and discussed above only hold in the MFI(1) and I(1) special cases. For higher orders of integration no such simple relations exist.
In the MFI(1) setting the deterministic component typically includes a constant, seasonal dummies and a linear trend. As discussed in Remark 6, a sufficiently rich set of deterministic components allows one to absorb non-zero initial values .
5.1.1. Testing Hypotheses on the State Space Unit Root Structure
Using the generic sets of transfer functions presented in Theorem 4, we can construct pseudo likelihood ratio tests for different hypotheses against chosen alternatives. Note, however, that by the results of Theorem 5 the null hypothesis includes all pairs as well as all pairs that are smaller than a pair .
As common in the VECM setting, first consider hypotheses at a single frequency . For an MFI(1) process, the hypothesis of a state space unit root structure equal to corresponds to the hypothesis of the (complex) cointegrating rank at frequency being equal to . Maximization of the pseudo likelihood function over the set —with a suitably chosen order n—leads to estimates that may be arbitrarily close to transfer functions with different state space unit root structures . These include with additional unit root frequencies , with the integers restricted only by the order n. Therefore, focusing on a single frequency does not rule out a more complicated true state space unit root structure. Assume with for and else. Corollary 4 shows that
since, e.g., .
Analogously to the procedure of testing for the complex cointegrating rank in the VECM setting, these inclusions can be employed to test for : Start with the hypothesis of against the alternative of and decrease the assumed consecutively until the test does not reject the null hypothesis.
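The resulting sequential procedure can be summarized schematically; `test_stat` and `crit_val` are placeholders for a pseudo likelihood ratio statistic and its critical value, which are not supplied in this paper.

```python
def determine_rank(test_stat, crit_val, c_max):
    """Schematic top-down sequence: test successively smaller hypothesized
    values until the null is not rejected; c_max is the largest value
    compatible with the chosen order n."""
    for c in range(c_max, -1, -1):
        if test_stat(c) <= crit_val(c):
            return c          # first non-rejection determines the estimate
    return 0
```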
Furthermore, one can formulate hypotheses on jointly at different frequencies . Again, there exist inclusions based on the definition of the set of attainable state space unit root structures and Corollary 4, which can be used to consecutively test hypotheses on .
5.1.2. Testing Hypotheses on CIVs and PCIVs
Johansen (1995) considers in the case three types of hypotheses on the cointegrating space spanned by the columns of , each motivated by examples from economic research. The different cases correspond to different types of restrictions implied by economic theory.
- (i)
- : The cointegrating space is known to be a subspace of the column space of H (which is of full column rank).
- (ii)
- : Some cointegrating relations are known.
- (iii)
- , for such that . Cointegrating relations are known to be in the column spaces of matrices (which are of full column rank).
As discussed in Example 1, cointegration at occurs if and only if a vector satisfies . In other words, the column space of is the orthocomplement of the cointegrating space spanned by the columns of and hypotheses on restrict entries of .
The first type of hypothesis, , implies that the column space of is equal to the orthocomplement of the column space of . Assume w.l.o.g. , and , such that the columns of form an orthonormal basis for the orthocomplement of the cointegrating space. Consider now the mapping:
where and as in Lemma 1. From this one can derive a parameterization of the set of matrices corresponding to , analogously to Lemma 1. The difference in the number of free parameters under the null hypothesis and under the alternative equals the difference between the numbers of free parameters in and , implying a reduction in the number of free parameters of under the null hypothesis. This necessarily coincides with the number of degrees of freedom of the corresponding test statistic in the VECM setting (cf. Johansen 1995, Theorem 7.2).
The second type of hypothesis, , is also straightforwardly parameterized: In this case a subspace of the cointegrating space is known and given by the column space of . Assume w.l.o.g. . The orthocomplement of is given by the set of matrices satisfying the restriction , i.e., the set defined in (13). The parameterization of this set has already been discussed. The reduction of the number of free parameters under the null hypothesis is which again coincides with the number of degrees of freedom of the corresponding test statistic in the VECM setting (cf. Johansen 1995, Theorem 7.3).
Finally, the third type of hypothesis, , is the most difficult to parameterize in our setting. As an illustrative example consider the case and . W.l.o.g. choose such that its columns span the -dimensional intersection of the column spaces of and and choose such that the columns of and span the column space of . Define , with . Let w.l.o.g. and define , for and . A parameterization of satisfying the restrictions under the null hypothesis can be derived from the following mapping:
where as in Lemma 1 and is a product of Givens rotations corresponding to the entries in the blocks highlighted by bold font. The three matrices are defined as follows:
Consequently, a parameterization of the orthocomplement of the cointegrating space is based on the mapping:
where as above and as in Lemma 1. Please note that for all , and it holds that . The number of parameters restricted under is equal to , and thus, through and , depends on the dimension of the intersection of the column spaces of and . The reduction of the number of free parameters matches the degrees of freedom of the test statistics in Johansen (1995, Theorem 7.5), if is identified, which is the case if and .
Using the mapping as a basis for a parameterization allows to introduce another type of hypotheses of the form:
- (iv)
- , for such that . The orthocomplement of the cointegrating space is contained in the column spaces of the (full rank) matrices .
This type of hypothesis allows one, e.g., to test for the presence of cross-unit cointegrating relations (cf. Wagner and Hlouskova 2009, Definition 1) in multi-country data sets.
Hypotheses on the cointegrating space at frequency can be treated analogously to hypotheses on the cointegrating space at frequency .
Testing hypotheses on cointegrating spaces at frequencies has to be discussed in more detail, as one also has to consider the space spanned by PCIVs, compare Example 2. There are linearly independent PCIVs of the form . Every PCIV corresponds to a vector orthogonal to and consequently hypotheses on the space spanned by PCIVs can be transformed to hypotheses on the complex column space of .
Consider, e.g., an extension of the first type of hypothesis of the form
with , , which implies that the column space of is equal to the orthocomplement of the column space of . This general hypothesis encompasses, e.g., the hypothesis , with , by setting , and . The extension is tailored to include the pairwise structure of PCIVs and to simplify transformation into hypotheses on the complex matrix used in the parameterization. The parameterization of the set of matrices corresponding to is derived from a mapping of the form given in (15), with and replaced by and as in Lemma 2.
Similarly, the three other types of hypotheses on the cointegrating spaces considered above can be extended to hypotheses on the space of PCIVs in the MFI(1) case. They translate into hypotheses on complex valued matrices orthogonal to . To parameterize the set of matrices restricted according to these null hypotheses, Lemma 2 is used. Thus, the restrictions implied by the extensions of all four types of hypotheses to hypotheses on the dynamic cointegrating spaces at frequencies for MFI(1) processes can be implemented using Givens rotations.
A different case of interest is the hypothesis of at least m linearly independent CIVs , with , i.e., an m-dimensional static cointegrating space at frequency , which we discuss as another illustrative example of the procedure in the case of cointegration at complex unit roots.
For the dynamic cointegrating space, this hypothesis implies the existence of linearly independent PCIVs of the form and , . In light of the discussion above the necessary condition for these two polynomials to be PCIVs is equivalent to , for . This restriction is similar to discussed above, except for the fact that the cointegrating vectors are not fully specified. This hypothesis is equivalent to the existence of an m-dimensional real kernel of . A suitable parameterization is derived from the following mapping
where and as in Lemma 2. The difference in the number of free parameters without restrictions and with restrictions is equal to .
The hypotheses can also be tested jointly for the cointegrating spaces of several unit roots.
5.1.3. Testing Hypotheses on the Adjustment Coefficients
As in the case of hypotheses on the cointegrating spaces , hypotheses on the adjustment coefficients are typically formulated as hypotheses on the column spaces of . We only focus on hypotheses on the real valued corresponding to frequency zero. Analogous hypotheses may be considered for at frequencies , using the same ideas.
The first type of hypothesis on is of the form and therefore, can be rewritten as . W.l.o.g. let and . We deal with this type of hypothesis as with in the previous section by simply reversing the roles of and . We, therefore, consider the set of feasible matrices as a subset in and use the mapping to derive a parameterization, while is restricted to be a p.u.t. matrix and the set of feasible matrices is parameterized accordingly.
As a second type of hypothesis Juselius (2006, sct. 11.9, p. 200) discusses , , linked to the absence of permanent effects of shocks on any of the variables of the system. Assume w.l.o.g. . Using the parameterization of defined in (13) for the set of feasible matrices and the parameterization of the set of p.u.t. matrices for the set of feasible matrices , implements this restriction.
The restrictions on reduce the number of free parameters by and the restrictions implied by lead to a reduction by free parameters, compared to the unrestricted case, which matches in both cases the number of degrees of freedom of the corresponding test statistic in the VECM framework.
5.1.4. Restrictions on the Deterministic Components
Including an unrestricted constant in the VECM equation leads to a linear trend in the solution process , for . If one restricts the constant to in a general VECM equation as given in (4), with of rank r, the constant does not cumulate to a linear trend in the solution process, while a constant non-zero mean is still present in the cointegrating relations, i.e., in the process . Analogously an unrestricted linear trend in the VECM equation leads to a quadratic trend of the form in the solution process, which is excluded by the restriction .
In the VECM framework, compare Johansen (1995, sct. 5.7, p. 81), five restrictions related to the coefficients corresponding to the constant and the linear trend are commonly considered:
with and and the following consequences for the solution processes: Under the solution process contains a quadratic trend in the direction of the common trends, i.e., in , and a linear trend in the direction of the cointegrating relations, i.e., in . Under the quadratic trend is not present. features a linear trend only in the directions of the common trends, a constant only in these directions. Under the constant is also present in the directions of the cointegrating relations.
In the state space framework the deterministic components can be added in the output equation , compare (9). Consequently, the above considered hypotheses can be imposed by formulating linear restrictions on . These can be directly parameterized by including the following deterministic components in the five considered cases:
where and . The component captures the influence of the initial value in the output equation.
In the VECM framework for the seasonal MFI(1) case, with of rank for , the deterministic component usually includes restricted seasonal dummies of the form , to avoid summation in the directions of the stochastic trends. The state space framework makes it straightforward to include seasonal dummies in the output equation in the form , . Again, it is of interest whether these components are unrestricted or whether they take the form , , which similarly allows these components to be reinterpreted as the influence of the initial values on the output.
Please note that is equivalently given by using real coefficients and the desired restrictions can be implemented accordingly.
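For concreteness, a regressor matrix stacking the deterministic components discussed above (constant, linear trend, seasonal dummies) might be built as follows; the column ordering and the simple 0/1 dummy coding are our choices, not the paper's.

```python
import numpy as np

def deterministic_terms(T, S=4, const=True, trend=False):
    """Constant, linear trend and S-1 seasonal dummy columns for t = 1..T."""
    t = np.arange(1, T + 1)
    cols = []
    if const:
        cols.append(np.ones(T))
    if trend:
        cols.append(t.astype(float))
    for k in range(1, S):
        cols.append((t % S == k).astype(float))  # quarterly dummies for S = 4
    return np.column_stack(cols)

D = deterministic_terms(8, S=4, const=True, trend=True)
print(D.shape)   # (8, 5): constant, trend, three seasonal dummies
```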
5.2. The Case
The state space unit root structure of I(2) processes is of the form , where the integer equals the dimension of , and equals the dimension of . Recall that the solution for and of the system in canonical form in this setting is given by
For VAR processes integrated of order two the integers and of the corresponding state space unit root structure are linked to the ranks of the matrices (denoted as ) and (denoted as ) in the VECM setting, as discussed in Section 2. It holds that and . The relation of the state space unit root structure to the cointegration indices was also discussed in Section 3.
Again, both the integers and and the ranks r and m, and consequently also the indices and , are closely related to the dimensions of the spaces spanned by CIVs and PCIVs. In the case the static cointegrating space of order is the orthocomplement of the column space of and thus of dimension . The dimension of the space spanned by CIVs of order is equal to , where denotes the rank of , since this space is the orthocomplement of the column space of . The space spanned by the PCIVs of order is of dimension smaller or equal to , due to the orthogonality constraint on given in Example 3.
Consider the matrices , and as defined in Section 2. From a state space realization in canonical form corresponding to a VAR process it immediately follows that the columns of span the same space as the columns of the sub-block . The same relation holds true for and the sub-block . With respect to polynomial cointegration, Bauer and Wagner (2012) show that the rank of determines the number of minimum degree polynomial cointegrating relations, as discussed in Example 3. If , then there exists no vector , such that is integrated and cointegrated with . In this case is a stationary process.
The deterministic components included in the I(2) setting are typically a constant and a linear trend. As in the MFI(1) case, identifiability problems occur if we consider a non-zero initial state : The solution to the state space equations for and is given by:
Hence, if , the output equation contains the terms and . Again, this implies non-identifiability, which is resolved by assuming , compare Remark 6.
5.2.1. Testing Hypotheses on the State Space Unit Root Structure
To simplify notation we use
with . Here for denotes the closure of the set of transfer functions of order n that possess a state space unit root structure of either or in case of , while denotes the closure of the set of all stable transfer functions of order n.
Considering the relations between the different sets of transfer functions given in Corollary 4 shows that the following relations hold (assuming ; the columns are arranged to include transfer functions with the same dimension of ):
Please note that corresponds to in Johansen (1995). Therefore, the relationships between the subsets match the ones in Johansen (1995, Table 9.1) and the ones found by Jensen (2013). The latter type of inclusion appears, for instance, for , containing transfer functions corresponding to processes, which is a subset of the set of transfer functions corresponding to processes.
The same remarks as in the MFI(1) case also apply in the I(2) case: When testing for , all attainable state space unit root structures have to be included in the null hypothesis.
5.2.2. Testing Hypotheses on CIVs and PCIVs
Johansen (2006) discusses several types of hypotheses on the cointegrating spaces of different orders. These deal with properties of , joint properties of or the occurrence of non-trivial polynomial cointegrating relations. Boswijk and Paruolo (2017), moreover, discuss testing hypotheses on the loading matrices of common trends (corresponding in our setting to testing hypotheses on ).
We commence with hypotheses of the form and just as in the MFI(1) case at unit root one, since hypotheses on correspond to hypotheses on its orthocomplement spanned by in the VARMA framework:
Hypotheses of the form imply . W.l.o.g. let and . As in the parameterization under in the MFI(1) case at unit root one, compare (15), use the mapping
to derive a parameterization of the set of feasible matrices , i.e., a joint parameterization of both sets of matrices and , where .
Hypotheses of the form are equivalent to . Assume w.l.o.g. and parameterize the set of feasible matrices using as defined in (13) and the set of feasible matrices using . Alternatively, parameterize the set of feasible matrices jointly as elements .
Applications using the VECM framework allow for testing hypotheses on . In the VARMA framework, these correspond to hypotheses on the orthogonal complement of , i.e., . Implementation of different types of hypotheses on proceeds as for similar hypotheses on in the MFI(1) case at unit root one, replacing by .
The hypothesis of no minimum degree polynomial cointegrating relations implies the restriction , compare Example 3. Therefore, we can test all hypotheses considered in Johansen (2006) also in our more general setting.
5.2.3. Testing Hypotheses on the Adjustment Coefficients
Hypotheses on and as defined in (6) and (7) correspond to hypotheses on the spaces spanned by the rows of and . For VAR processes integrated of order two, the row space of is equal to the orthogonal complement of the column space of , while the row space of is equal to the orthogonal complement of the column space of . The restrictions corresponding to hypotheses on and can be implemented analogously to the restrictions corresponding to hypotheses on in Section 5.1.3, reversing the roles of the relevant sub-blocks in and accordingly.
5.2.4. Restrictions on the Deterministic Components
The I(2) case is, with respect to the modeling of deterministic components, less well studied than the MFI(1) case. In most theory papers they are simply left out, with the notable exception of Rahbek et al. (1999), which deals with the inclusion of a constant term in the I(2)-VECM representation. The main reason for this appears to be the way deterministic components in the defining vector error correction representation translate into deterministic components in the corresponding solution process. An unrestricted constant in the VECM for I(2) processes leads to a linear trend in and a quadratic trend in , while an unrestricted linear trend results in quadratic and cubic trends in the respective directions. Already in the I(1) case discussed above, five different cases—with respect to integration and asymptotic behavior of estimators and tests—need to be considered separately. An all-encompassing discussion of the restrictions on the coefficients of a constant and a linear trend in the I(2) case requires the specification of even more cases. As an alternative approach in the VECM framework, deterministic components could be dealt with by replacing with in the VECM equation. This has recently been considered in Johansen and Nielsen (2018) and is analogous to our approach in the state space framework.
As in the MFI(1) and I(1) cases before, the analysis of (the impact of) deterministic components is straightforward in the state space framework, which effectively stems from their additive inclusion in the Granger-type representation, compare (9). Choose, e.g., , as in the I(1) case. In analogy to Section 5.1.4, linear restrictions of deterministic components in relation to the static and polynomial cointegrating spaces can be embedded in a parameterization. Focusing on , e.g., this is achieved by
where the columns of are a basis for the column space of , which does not necessarily have full column rank, and the columns of span the orthocomplement of the column space of . The matrix can be decomposed analogously. The corresponding parameterization then allows one to consider different restricted versions of deterministic components and to study the asymptotic behavior of estimators and tests for these cases.
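A sketch of the directional decomposition used here, assuming for simplicity that the spanning matrix has full column rank (the text explicitly allows rank deficiency, in which case a pseudoinverse-based projector would be used instead); names are ours:

```python
import numpy as np

def split_directions(mu, beta):
    """Split a coefficient vector mu into its component in col(beta) and the
    component in the orthocomplement of col(beta)."""
    P = beta @ np.linalg.solve(beta.T @ beta, beta.T)  # projector onto col(beta)
    return P @ mu, mu - P @ mu

beta = np.array([[1.0], [-1.0]])
inside, outside = split_directions(np.array([2.0, 0.0]), beta)
# inside lies in span{(1, -1)'}, outside in span{(1, 1)'}
```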
6. Summary and Conclusions
Vector autoregressive moving average (VARMA) processes, which can be cast equivalently in the state space framework, offer potential advantages for empirical analysis over the more restrictive class of vector autoregressive (VAR) processes for a variety of reasons. These include invariance with respect to marginalization and aggregation, parsimony, as well as the fact that the log-linearized solutions to DSGE models are typically VARMA processes rather than VAR processes. Realizing these potential advantages necessitates, in our view, developing cointegration analysis for VARMA processes to a similar extent as it is developed for VAR processes. The necessary first step of this research agenda is to develop a set of structure theoretic results that subsequently allow the development of statistical inference procedures. Bauer and Wagner (2012) takes the very first step of this agenda by providing a canonical form for unit root processes in the state space framework, which is shown in that paper to be very convenient for cointegration analysis.
Based on that canonical form, this paper derives a parameterization of state space models for VARMA processes with unit roots. The canonical form, and a fortiori the parameterization based on it, are constructed to facilitate the investigation of the unit root and (static and polynomial) cointegration properties of the considered process. Furthermore, the paper shows that the framework allows testing a large variety of hypotheses on cointegrating ranks and spaces, clearly a key aspect for the usefulness of any method to analyze cointegration. In addition to providing general results, throughout the paper all results are discussed in detail for the multiple frequency I(1) and I(2) cases, which cover the vast majority of applications.
Given the fact that (as shown in Hazewinkel and Kalman 1976) VARMA unit root processes cannot be continuously parameterized, the set of all unit root processes (as defined in this paper) is partitioned according to a multi-index that includes the state space unit root structure. The parameterization is shown to be a diffeomorphism on the interior of the considered sets. The topological relationships between the sets forming the partitioning of all transfer functions considered are studied in great detail for three reasons: First, pseudo maximum likelihood estimation effectively amounts to maximizing the pseudo likelihood function over the closures of sets of transfer functions, in our notation. Second, related to the first item, the relations between subsets of have to be understood in detail as knowledge concerning these relations is required for developing (sequential) pseudo likelihood-ratio tests for the numbers of stochastic trends or cycles. Third, of particular importance for the implementation of, e.g., pseudo maximum likelihood estimators, we discuss the existence of generic pieces.
In this respect we derive two results: First, for correctly specified state space unit root structure and system order of the stable subsystem —and thus correctly specified system order—we explicitly describe generic indices such that is open and dense in the set of all transfer functions with state space unit root structure and system order of the stable subsystem . This result forms the basis for establishing consistent estimators of the transfer functions—and via continuity of the parameterization—of the parameter estimators when the state space unit root structure and system order are known. Second, in case only an upper bound on the system order is known (or specified), we show the existence of a generic multi-index for which the set of corresponding transfer functions is open and dense in the set of all non-explosive transfer functions whose order (or McMillan degree) is bounded by n. This result is the basis for consistent estimation (on an open and dense subset) when only an upper bound of the system order is known. In turn this estimator is the starting point for determining , using the subset relationships alluded to above in the second point. For the MFI(1) and I(2) cases we show in detail that similar subset relations (concerning cointegrating ranks) as in the cointegrated VAR MFI(1) and I(2) cases hold, which suggests constructing similar sequential test procedures for determining the cointegrating ranks as in the VAR cointegration literature.
Section 5 is devoted to a detailed discussion of testing hypotheses on the cointegrating spaces, again for both the MFI(1) and the I(2) case. In this section, particular emphasis is put on modeling deterministic components. The discussion details how all usually formulated and tested hypotheses concerning (static and polynomial) cointegrating vectors, potentially in combination with (un-)restricted deterministic components, in the VAR framework can also be investigated in the state space framework.
Altogether, the paper sets the stage to develop pseudo maximum likelihood estimators, investigate their asymptotic properties (consistency and limiting distributions) and tests based on them for determining cointegrating ranks that allow performing cointegration analysis for cointegrated VARMA processes. The detailed discussion of the MFI(1) and I(2) cases benefits the development of statistical theory dealing with these cases undertaken in a series of companion papers.
Author Contributions
The authors of the paper have contributed equally, via joint efforts, regarding ideas, research, and writing. Conceptualization, all authors; methodology, all authors; formal analysis, P.d.M.R. and L.M.; investigation, all authors; writing—original draft preparation, P.d.M.R. and L.M.; writing—review and editing, all authors; project administration, D.B. and M.W.; funding acquisition, D.B. and M.W. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation; Projektnummer 276051388), which is gratefully acknowledged. We acknowledge support for the publication costs by the Deutsche Forschungsgemeinschaft and the Open Access Publication Fund of Bielefeld University.
Acknowledgments
We thank the editors, Rocco Mosconi and Paolo Paruolo, as well as anonymous referees for helpful suggestions. The views expressed in this paper are solely those of the authors and not necessarily those of the Bank of Slovenia or the European System of Central Banks. On top of this the usual disclaimer applies.
Conflicts of Interest
The authors declare no conflict of interest.
Appendix A. Proofs of the Results of Section 3
Appendix A.1. Proof of Lemma 1
- (i) Let be a sequence in converging to for . By continuity of matrix multiplication it follows that , which shows that is closed. By construction . Since for all and , the entries of C are bounded.
- (ii) By definition is a product of matrices whose elements are either constant or infinitely often differentiable functions of the elements of .
- (iii) The algorithm discussed above Lemma 1 maps every to . Since for all and , C can be obtained by multiplying with the transposed Givens rotations; see the illustration following this proof.
- (iv) As discussed, is obtained from a repeated application of the algorithm described in Remark 10. In each step two entries are transformed into polar coordinates. According to Amann and Escher (2008, chp. 8, p. 204), the transformation to polar coordinates is infinitely often differentiable with infinitely often differentiable inverse for (and hence ), i.e., on the interior of the interval . Thus, is a concatenation of functions which are infinitely often differentiable on the interior of and is thus infinitely often differentiable, if for all components of . Clearly, the interior of is open and dense in . By the definition of continuity the pre-image of the interior of is open in . By (iii) there exists a for arbitrary such that . Since the interior of is dense in , there exists a sequence in the interior of such that . Then because of the continuity of . Since is a sequence in the pre-image of the interior of , it follows that the pre-image of the interior of is dense in .
- (v) For any it holds that and , which implies . Since the determinant is a continuous function on square matrices, the sets and are disjoint and closed.
- (vi) The proof proceeds analogously to the proof of (iii).
- (vii) A function defined on two disjoint subsets is infinitely often differentiable if and only if its restrictions to the two subsets are infinitely often differentiable. The same arguments as used in (iv), together with the results in (ii), imply that and are infinitely often differentiable with infinitely often differentiable inverses on an open subset of . Clearly, multiplication with is infinitely often differentiable with infinitely often differentiable inverse, which implies that is infinitely often differentiable with infinitely often differentiable inverse on an open subset of , from which the result follows.
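For concreteness, the two devices used in (iii) and (iv) can be written out explicitly (a sketch in our own notation; both facts are standard, compare Golub and van Loan (1996) for Givens rotations and Amann and Escher (2008, chp. 8) for polar coordinates). A Givens rotation acting on a pair of coordinates and the planar polar coordinate transformation take the form

$$ G(\phi) = \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix}, \qquad (r,\phi) \mapsto (r\cos\phi,\, r\sin\phi). $$

Since $G(\phi)'G(\phi) = I_2$, a normalization achieved by a product of Givens rotations is undone by multiplying with the transposed rotations, as used in (iii); and the polar coordinate map is infinitely often differentiable with infinitely often differentiable inverse for $r > 0$ and $\phi$ in the interior of the angular interval, as used in (iv).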
Appendix A.2. Proof of Lemma 2
- (i) Let be a sequence in converging to for . By continuity of matrix multiplication it follows that , which shows that is closed. By construction . Since for all and , the entries of C are bounded.
- (ii) By definition is a product of matrices whose elements are either constant or infinitely often differentiable functions of the elements of .
- (iii) The algorithm discussed above Lemma 2 maps every to with . Since for all and , C can be obtained by multiplying with the transposed Givens rotations.
- (iv) The algorithms in Remark 12 and above Lemma 2 describe in detail. The determination of an element of or uses the transformation of two complex numbers into polar coordinates in step 2 of Remark 12, which according to Amann and Escher (2008, chp. 8, p. 204) is infinitely often differentiable with infinitely often differentiable inverse except on the non-negative reals, which are the complement of an open and dense subset of the complex plane. Step 3 of Remark 12 uses the formulas , which is infinitely often differentiable for , and , which is infinitely often differentiable for , which occurs on an open and dense subset of . For the determination of an element of , a complex number of modulus one is transformed into polar coordinates, which is infinitely often differentiable on an open and dense subset of the complex numbers of modulus one; compare again Amann and Escher (2008, chp. 8, p. 204) and the illustration following this proof. Thus, is a concatenation of functions which are infinitely often differentiable on open and dense subsets of their domains of definition and is thus infinitely often differentiable on an open and dense subset of .
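For the complex case (again a sketch in our own notation), the transformation referred to in the proof is the complex polar coordinate map

$$ z = r e^{i\phi}, \qquad r > 0, \quad \phi \in (0, 2\pi), $$

which is infinitely often differentiable with infinitely often differentiable inverse on $\mathbb{C}\setminus[0,\infty)$, i.e., everywhere except on the non-negative reals; the exceptional set is exactly the complement of the open and dense subset of the complex plane mentioned above.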
Appendix A.3. Proof of Theorem 2
- (i) The multi-index is unique for a transfer function , since it only contains information encoded in the canonical form. Therefore, is well defined. Since conversely for every transfer function a multi-index can be found, constitutes a partitioning of . Furthermore, using the canonical form, it is straightforward to see that the mapping attaching the triple in canonical form to a transfer function is homeomorphic (bijective, continuous, with continuous inverse): Bijectivity is a consequence of the definition of the canonical form. Continuity of the transfer function as a function of the matrix triple is obvious from the definition of ; see also the sketch following this proof. Continuity of the inverse can be shown by constructing the canonical form starting with an overlapping echelon form (which is continuous according to Hannan and Deistler 1988, chp. 2) and subsequently transforming the state basis to reach the canonical form. This involves the calculation of a Jordan normal form with fixed structure, which is an analytic mapping (cf. Chatelin 1993, Theorem 4.4.3). Finally, the restrictions on C and B are imposed. For given multi-index these transformations are continuous (as discussed above, they involve QR decompositions to obtain unitary block columns for the blocks of C, rotations to p.u.t. form with fixed structure for the blocks of B, and transformations to echelon canonical form for the stable part).
- (ii) The construction of the triple for given and is straightforward: is uniquely determined by . Since contains the entries of restricted to be positive and contains the free parameters of , the mapping is continuous. The mapping is continuous (cf. Hannan and Deistler 1988, Theorem 2.5.3 (ii)). The mapping consists of iterated applications of and (compare Lemmas 1 and 2), which are differentiable and thus continuous, and of iterated applications of the extensions of the mappings and (compare Corollaries 1 and 2) to general unit root structures and to complex matrices. The proof that these functions are differentiable is analogous to the proofs of Lemmas 1 and 2.
- (iii) The definitions of and immediately imply that they depend continuously on . The parameter vector depends continuously on (cf. Hannan and Deistler 1988, Theorem 2.5.3 (ii)). The existence of an open and dense subset of matrices such that the mapping attaching parameters to the matrices is continuous follows from arguments contained in the proofs of Lemmas 1 and 2.
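To sketch the continuity assertion in (i), assume the innovation form convention for state space systems (the paper's precise notation is fixed in Section 3, so the following rendering is ours): a triple $(A, B, C)$ with state dimension $n$ and output dimension $s$ has transfer function

$$ k(z) = I_s + z\,C(I_n - zA)^{-1}B = I_s + \sum_{j \ge 1} C A^{j-1} B\, z^j. $$

Every power series coefficient $C A^{j-1} B$ is a polynomial in the entries of $(A, B, C)$, so convergence of the triples implies coefficient-wise and hence pointwise convergence of the transfer functions.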
Appendix B. Proofs of the Results of Section 4
Appendix B.1. Proof of Theorem 3
For the first inclusion the proof can be divided into two parts, discussing the stable and the unstable subsystem separately. The result with regard to the stable subsystem is due to Hannan and Deistler (1988, Theorem 2.5.3 (iv)). For the unstable subsystem implies the existence of a matrix S as described in Definition 9. Partition such that . Let be an arbitrary transfer function in with corresponding state space realization . Then, we find matrices and such that for the state space realization given by , and it holds that . Then, , where is the number of rows of for converges for to , which is observationally equivalent to . Consequently, .
To show the second inclusion, consider a sequence of systems converging to . We need to show , where is the multi-index corresponding to .
For the stable system we can separate the subsystem remaining stable in the limit and the part with eigenvalues of tending to the unit circle. As discussed in Section 4.1.2, converges to the stable subsystem whose Kronecker indices can only be smaller than or equal to (cf. Hannan and Deistler 1988, Theorem 2.5.3).
The remaining subsystem consists of the unstable subsystem of which converges to and the second part of the stable subsystem containing all stable eigenvalues of converging to the unit circle. The limiting combined subsystem is such that is block diagonal. If the limiting combined subsystem is minimal and has a structure corresponding to p, this shows that the pair extends in accordance with the definition of .
Since the limiting subsystem is not necessarily minimal and does not necessarily have a structure corresponding to p, eliminating coordinates of the state and adapting the corresponding structure indices p may result in a pair that is smaller than the pair corresponding to an element of .
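A minimal scalar sketch of the second mechanism (stable eigenvalues tending to the unit circle; the numbers are hypothetical and the notation ours): consider first-order systems with state transition coefficients $a_j = 1 - 1/j$ and fixed $b, c \neq 0$, so that

$$ k_j(z) = 1 + \frac{cb\,z}{1 - a_j z} \;\longrightarrow\; 1 + \frac{cb\,z}{1 - z} \qquad (j \to \infty), $$

with every power series coefficient $cb\,a_j^{m-1}$ converging to $cb$. A sequence of stable transfer functions thus converges pointwise to a transfer function with a pole at $z = 1$: in the limit a stable eigenvalue reaches the unit circle and the state space unit root structure changes, while the order of the stable subsystem drops.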
Appendix B.2. Proof of Theorem 4
The multi-index contains three components: . For given , the selection of the structure indices introducing the fewest restrictions, such that in its boundary all possible p.u.t. matrices occur, was discussed in Section 4.2. Choosing this maximal element then implies that all systems of given state space unit root structure correspond to a multi-index that is smaller than or equal to , where is a Kronecker index corresponding to state space dimension . For Kronecker indices of order it is known that there exists one index such that is open and dense in . The set is, therefore, contained in , which implies (14) with .
For the second claim choose an arbitrary state space realization in canonical form such that for arbitrary . Define the sequence by , , . Then holds for all j, which implies for every and every j. The continuity of implies .
Appendix B.3. Proof of Theorem 5
- (i) Assume that there exists a sequence converging to a transfer function . For such a sequence the sizes of the Jordan blocks for every unit root are identical from some onwards, since eigenvalues depend continuously on the matrices (cf. Chatelin 1993): Thus, the stable parts of the transfer functions must converge to the stable part of the transfer function , since the sum of the algebraic multiplicities of all eigenvalues inside the open unit disc cannot drop in the limit. Since (the set of all stable transfer functions with Kronecker index ) is open in according to Hannan and Deistler (1988, Theorem 2.5.3), this implies that the stable part of has Kronecker index from some onwards. For the unstable part of the transfer function note that in for every unit root the rank of is equal for every r. Thus, the maximum over cannot be larger due to lower semi-continuity of the rank. It follows that for the ranks of for all and for all are identical to the ranks corresponding to from some point onwards, showing that has the same state space unit root structure as from some onwards. Finally, the p.u.t. structure of sub-blocks of clearly defines an open set, being given by strict inequalities. This shows that from some onwards, implying that is open in .
- (ii) The first inclusion was shown in Theorem 3. Comparing Definitions 10 and 11 we see . By the definition of the partial ordering (compare Definition 9), holds. Together these two statements imply the second inclusion. The next relation is a consequence of the following two statements:
- (a) If , then .
- (b) If , then .
For (a) note that for an arbitrary transfer function with there is a multi-index such that . By the definition of the partial ordering (compare Definition 9) we find a multi-index such that . By Theorem 3 and the continuity of we have . Since by assumption, this finishes the proof of (a).

With respect to (b) note that by Definition 11, contains transfer functions with two types of state space unit root structures. First, corresponding to state space unit root may be of the form given in (A1). Second, corresponding to state space unit root may be of the form (A1) with the off-diagonal elements of replaced by zero. To prove (b) we need to show that in both cases the corresponding transfer function is contained in .

We start by showing that in the second case the transfer function is contained in , where is the state space unit root structure corresponding to in (A1). For this, consider the sequence . Clearly, every system corresponds to an process, while the limit for corresponds to an process. This shows that it is possible in the limit to trade one component for two components, leading to more transfer functions in the closure of than only the ones included in , where the off-diagonal entry in is restricted to equal one and hence the corresponding sequence of systems in the canonical form diverges to infinity. In a sense these systems correspond to “points at infinity”: For the example given above we obtain the canonical form . Thus, the corresponding parameter vector for the entries in converges to zero and the one corresponding to to infinity. Generalizing this argument shows that every transfer function corresponding to a pair in , where can be obtained by replacing off-diagonal entries of with zero, can be reached from within .

To prove containment in the first case, where the state space unit root structure is extended as visible in Equation (A1), consider the sequence corresponding to the following system in canonical form (except that the stable subsystem is not necessarily in echelon canonical form): . This sequence shows that there exists a sequence of transfer functions corresponding to processes with one common trend that converge to a transfer function corresponding to an system. Again, in the canonical form this cannot happen, as there the entry of would be restricted to be equal to zero. At the same time note that the dimension of the stable system is reduced due to one component of the state changing from the stable to the unit root part.

Now for a unit root structure such that , satisfying , the Jordan blocks corresponding to are sub-blocks of the ones corresponding to , potentially involving a reordering of coordinates using the permutation matrix S. Taking as the approximating sequence transfer functions of the same structure but with replaced by leads to processes with state space unit root structure .

For the stable part of we can separate the part containing poles tending to the unit circle (contained in ) and the remaining transfer function , which has Kronecker indices . However, the results of Hannan and Deistler (1988, Theorem 2.5.3) then imply that the limit remains in and hence allows for an approximating sequence in .

Both results combined cover the whole set of attainable state space unit root structures in Definition 11 and prove (b).

As follows from Corollary 4, . Thus, (b) implies and (a) adds the second union, showing the subset inclusion. It remains to show equality for the last set inclusion. Thus, we need to show that for , it holds that , where .
To this end note that the rank of a matrix is a lower semi-continuous function, i.e., the rank of the limit of a convergent sequence of matrices is bounded from above by the limit inferior of the ranks along the sequence. Then, consider a sequence . We can find a converging sequence of systems realizing . Therefore, choosing we obtain that , since implies that the number of generalized eigenvalues at the unit roots is governed by the entries of the state space unit root structure . This implies that for . Consequently, the limit has at least as many chains of generalized eigenvalues of each maximal length as dictated by the state space unit root structure for each unit root of the limiting system.

Rearranging the rows and columns of the Jordan normal form using a permutation matrix S, it is then obvious that either the limiting matrix has additional eigenvalues, in which case must hold, or upper diagonal entries in must be changed from ones to zeros in order to convert some of the chains to lower order. One example in this respect was given above: For the rank of is equal to 1 for and 0 for . For the limit we obtain , and hence the rank is zero for . The corresponding indices are for the approximating sequence and for the limit, respectively. Summing these indices starting from the last one, one obtains and . Hence the state space unit root structure corresponding to must be attainable according to Definition 11. The number of stable state components must decrease accordingly.

Finally, the limiting system is potentially not minimal. In this case the pair is reduced to a smaller one, concluding the proof.
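To make the rank example concrete (a reconstruction in our notation of the two-dimensional example the proof refers to): for

$$ J_\epsilon = \begin{pmatrix} 1 & \epsilon \\ 0 & 1 \end{pmatrix} \longrightarrow J_0 = I_2 \quad (\epsilon \to 0), \qquad \operatorname{rank}(J_\epsilon - I_2) = 1 \ \text{for } \epsilon \neq 0, \qquad \operatorname{rank}(J_0 - I_2) = 0, $$

one chain of generalized eigenvalues of length two (an I(2)-type structure) degenerates in the limit into two chains of length one (an I(1)-type structure). The ranks of the powers of the matrix minus the identity can only drop in the limit, exactly as lower semi-continuity of the rank dictates.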
References
- Amann, Herbert, and Joachim Escher. 2008. Analysis III. Basel: Birkhäuser.
- Aoki, Masanao. 1990. State Space Modeling of Time Series. New York: Springer.
- Bauer, Dietmar, and Martin Wagner. 2003. On Polynomial Cointegration in the State Space Framework. Mimeo.
- Bauer, Dietmar, and Martin Wagner. 2005. Autoregressive Approximations of Multiple Frequency I(1) Processes. IHS Economics Series No. 174. Vienna: Institut für Höhere Studien–Institute for Advanced Studies (IHS). Available online: http://hdl.handle.net/10419/72306 (accessed on 3 November 2020).
- Bauer, Dietmar, and Martin Wagner. 2012. A State Space Canonical Form for Unit Root Processes. Econometric Theory 28: 1313–49.
- Boswijk, H. Peter, and Paolo Paruolo. 2017. Likelihood Ratio Tests of Restrictions on Common Trends Loading Matrices in I(2) VAR Systems. Econometrics 5: 28.
- Campbell, John Y. 1994. Inspecting the Mechanism: An Analytical Approach to the Stochastic Growth Model. Journal of Monetary Economics 33: 463–506.
- Chatelin, Françoise. 1993. Eigenvalues of Matrices. New York: John Wiley & Sons.
- Engle, Robert F., and Clive W.J. Granger. 1987. Co-Integration and Error Correction: Representation, Estimation and Testing. Econometrica 55: 251–76.
- Golub, Gene H., and Charles F. Van Loan. 1996. Matrix Computations, 3rd ed. Baltimore: The Johns Hopkins University Press.
- Granger, Clive W.J. 1981. Some Properties of Time Series Data and Their Use in Econometric Model Specification. Journal of Econometrics 16: 121–30.
- Hannan, Edward J., and Manfred Deistler. 1988. The Statistical Theory of Linear Systems. New York: John Wiley & Sons.
- Hazewinkel, Michiel, and Rudolf E. Kalman. 1976. Invariants, Canonical Forms and Moduli for Linear, Constant, Finite Dimensional, Dynamical Systems. In Mathematical Systems Theory. Edited by Giovanni Marchesini and Sanjoy Kumar Mitter. Berlin: Springer, chp. 4, pp. 48–60.
- Jensen, Andreas N. 2013. The Nesting Structure of the Cointegrated Vector Autoregressive Models. Paper presented at the QED Conference 2013, Vienna, Austria, May 3–4.
- Johansen, Søren. 1991. Estimation and Hypothesis Testing of Cointegration Vectors in Gaussian Vector Autoregressive Models. Econometrica 59: 1551–80.
- Johansen, Søren. 1995. Likelihood-Based Inference in Cointegrated Vector Auto-Regressive Models. Oxford: Oxford University Press.
- Johansen, Søren. 1997. Likelihood Analysis of the I(2) Model. Scandinavian Journal of Statistics 24: 433–62.
- Johansen, Søren. 2006. Statistical Analysis of Hypotheses on the Cointegrating Relations in the I(2) Model. Journal of Econometrics 132: 81–115.
- Johansen, Søren, and Morten Ø. Nielsen. 2018. The Cointegrated Vector Autoregressive Model with General Deterministic Terms. Journal of Econometrics 202: 214–29.
- Johansen, Søren, and Ernst Schaumburg. 1999. Likelihood Analysis of Seasonal Cointegration. Journal of Econometrics 88: 301–39.
- Juselius, Katarina. 2006. The Cointegrated VAR Model: Methodology and Applications. Oxford: Oxford University Press.
- Lewis, Richard, and Gregory C. Reinsel. 1985. Prediction of Multivariate Time Series by Autoregressive Model Fitting. Journal of Multivariate Analysis 16: 393–411.
- Otto, Markus. 2011. Rechenmethoden für Studierende der Physik im ersten Jahr. Heidelberg: Spektrum Akademischer Verlag.
- Paruolo, Paolo. 1996. On the Determination of Integration Indices in I(2) Systems. Journal of Econometrics 72: 313–56.
- Paruolo, Paolo. 2000. Asymptotic Efficiency of the Two Stage Estimator in I(2) Systems. Econometric Theory 16: 524–50.
- Poskitt, Donald S. 2006. On the Identification and Estimation of Nonstationary and Cointegrated ARMAX Systems. Econometric Theory 22: 1138–75.
- Rahbek, Anders, Hans C. Kongsted, and Clara Jørgensen. 1999. Trend Stationarity in the I(2) Cointegration Model. Journal of Econometrics 90: 265–89.
- Saikkonen, Pentti. 1992. Estimation and Testing of Cointegrated Systems by an Autoregressive Approximation. Econometric Theory 8: 1–27.
- Saikkonen, Pentti, and Ritva Luukkonen. 1997. Testing Cointegration in Infinite Order Vector Autoregressive Processes. Journal of Econometrics 81: 93–126.
- Sims, Christopher A., James H. Stock, and Mark W. Watson. 1990. Inference in Linear Time Series Models with Some Unit Roots. Econometrica 58: 113–44.
- Wagner, Martin. 2018. Estimation and Inference for Cointegrating Regressions. In Oxford Research Encyclopedia of Economics and Finance. Oxford: Oxford University Press.
- Wagner, Martin, and Jaroslava Hlouskova. 2009. The Performance of Panel Cointegration Methods: Results from a Large Scale Simulation Study. Econometric Reviews 29: 182–223.
- Zellner, Arnold, and Franz C. Palm. 1974. Time Series Analysis and Simultaneous Equation Econometric Models. Journal of Econometrics 2: 17–54.
1. Please note that the original approach to the estimation of cointegrating relationships was least squares estimation in a non- or semi-parametric regression setting, see, e.g., Engle and Granger (1987). A recent survey of regression-based cointegration analysis is provided by Wagner (2018).
2. The complexity of these inter-relations is perhaps best illustrated by the fact that only Jensen (2013) notes that “even though the I(2) models are formulated as submodels of I(1) models, some I(1) models are in fact submodels of I(2) models”.
3. The literature often uses VAR models as approximations, based on the fact that VARMA processes can often be approximated by VAR models whose order tends to infinity with the sample size at certain rates. This line of work goes back to Lewis and Reinsel (1985) for stationary processes and was extended to (co)integrated processes by Saikkonen (1992), Saikkonen and Luukkonen (1997) and Bauer and Wagner (2005). In addition to the issue of the existence and properties of a sequence of VAR approximations, the question of whether a VAR approximation is parsimonious remains.
4. Below we often use the term “likelihood” as short form of “likelihood function”.
5. We are confident that this dual usage of notation does not lead to confusion.
6. Our definition of VAR processes differs to a certain extent from some widely used definitions in the literature. Given our focus on unit root and cointegration analysis we, unlike Hannan and Deistler (1988), allow for determinantal roots on the unit circle that, as is well known, lead to integrated processes. We also include deterministic components in our definition, i.e., we allow for a special case of exogenous variables, compare also Remark 2 below. There is, however, also a large part of the literature that refers to this setting simply as (cointegrated) vector autoregressive models, see, e.g., Johansen (1995) and Juselius (2006).
7. Of course, the statistical properties of the parameter estimators depend in many ways on the deterministic components.
8. The set is endowed with the pointwise topology, defined in Section 3. For now, in the context of VAR models, it suffices to know that convergence in the pointwise topology is equivalent to convergence of the VAR coefficient matrices in the Frobenius norm.
9. Please note that in the case of restricted estimation, i.e., zero restrictions or cross-equation restrictions, OLS is not asymptotically equivalent to PML in general.
10. A similar property holds for being a “thin” subset of . This implies that the probability that the OLS estimator calculated over corresponds to an element is equal to zero in general.
11. Below Example 3 we clarify how these indices are related to the state space unit root structure defined in Bauer and Wagner (2012, Definition 2) and link these to the dimensions of the cointegrating spaces in Section 5.2.
12. Uniqueness of realizations in the VAR case stems from the normalization , which reduces the class of observationally equivalent VAR realizations of the same transfer function , with , to a singleton.
13. The pair is left coprime if all its left divisors are unimodular matrices. Unimodular matrices are polynomial matrices with constant non-zero determinant. Thus, pre-multiplication of, e.g., with a unimodular matrix does not affect the determinantal roots that shape the dynamic behavior of the solutions of VAR models.
14. When using the echelon canonical form, the partitioning is according to the so-called Kronecker indices, related to a basis selection for the row space of the Hankel matrix corresponding to the transfer function ; see, e.g., Hannan and Deistler (1988, chp. 2.4) for a precise definition.
15. Here and below we only consider state space systems in so-called innovation representation, with the same error in both the output equation and the state equation. Since every state space system has an innovation representation, this is no restriction, compare Aoki (1990, chp. 7.1).
16. The definition of cointegrating spaces as linear subspaces allows one to characterize them by a basis and implies a well-defined dimension. These advantages, however, have the implication that the zero vector is an element of all cointegrating spaces, despite not being a cointegrating vector in our definition, where the zero vector is excluded. This issue is, of course, well known in the cointegration literature.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).