Cointegration and Error Correction Mechanisms for Singular Stochastic Vectors
Università di Bologna, Department of Economics, 40126 Bologna, Italy
Einaudi Institute for Economics and Finance, 00187 Roma, Italy
Federal Reserve Board of Governors, Washington, DC 20551, USA
Author to whom correspondence should be addressed.
Econometrics 2020, 8(1), 3; https://doi.org/10.3390/econometrics8010003
Received: 28 March 2018 / Revised: 30 December 2019 / Accepted: 7 January 2020 / Published: 4 February 2020
(This article belongs to the Special Issue Celebrated Econometricians: Katarina Juselius and Søren Johansen)
Large-dimensional dynamic factor models and dynamic stochastic general equilibrium models, both widely used in empirical macroeconomics, deal with singular stochastic vectors, i.e., vectors of dimension r which are driven by a q-dimensional white noise, with q < r. The present paper studies cointegration and error correction representations for an I(1) singular stochastic vector. It is easily seen that such a vector is necessarily cointegrated with cointegrating rank c ≥ r − q. Our contributions are: (i) we generalize Johansen’s proof of the Granger representation theorem to I(1) singular vectors under the assumption that the first difference of the vector has rational spectral density; (ii) using recent results on singular vectors by Anderson and Deistler, we prove that for generic values of the parameters the autoregressive representation of the vector has a finite-degree polynomial. The relationship between the cointegration of the factors and the cointegration of the observable variables in a large-dimensional factor model is also discussed.
An r-dimensional stochastic vector that can be written as a moving average, with r × q coefficient matrices, of a q-dimensional white noise, with q < r, is said to be singular. Singular stochastic vectors have been systematically analyzed in a number of papers starting with (Anderson and Deistler 2008a, 2008b). A motivation for studying the consequences of singularity, as argued by these authors, is that the factors’ vector in large-dimensional dynamic factor models (DFM), such as those introduced in Forni et al. (2000); Forni and Lippi (2001), (Stock and Watson 2002a, 2002b), is typically singular. Singularity is also an important feature of dynamic stochastic general equilibrium models (DSGE), see e.g., Sargent (1989), Canova (2007), pp. 230–2. Singularity as it arises in DFMs is presented in some detail below.
DFMs are based on the idea that all the observed variables in an economic system are driven by a few common (macroeconomic) shocks and by idiosyncratic components which may result from measurement errors and sectoral or regional shocks. Formally, each variable in the n-dimensional dataset , , is decomposed into the sum of a common component , and an idiosyncratic component : , where and are orthogonal for all . In the standard version of the DFM the common components are linear combinations of an r-dimensional vector of common factors ,
Now suppose that the observable variables and the common factors are I(1) and that the factors are driven, as specified in Equation (2), by a nonsingular q-dimensional white-noise vector, the common shocks. A number of papers analyzing macroeconomic databases find strong empirical support for the assumption that the factor vector is singular, i.e., that r > q. See, for US datasets, Giannone et al. (2005); Amengual and Watson (2007); Forni and Gambetti (2010); Luciani (2015). For a Euro-area dataset, see Barigozzi et al. (2014).
Such results can be easily understood observing that usually the static Equation (1) is just a convenient representation derived from a “primitive” set of dynamic equations linking the common components to the common shocks . As a simple example, suppose that the variables are driven by a common one-dimensional cyclical process , such that , where is scalar white noise, and that the variables load dynamically:
In this case we can set , , , , so that Equations (1) and (2) take the formrespectively. Here and so that is singular. For a general analysis of the relationship between representation (1) and “deeper” dynamic representations like (3), see e.g., Forni et al. (2009); Stock and Watson (2016).
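The example can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the cyclical process to be a stationary AR(1) with coefficient `a = 0.7` and stacks the process and its first lag into a two-dimensional factor vector. The names `simulate_factors` and `var1_residual_cov`, and all numerical values, are hypothetical and not taken from the paper. A VAR(1) fitted to the stacked vector has a residual covariance matrix of rank one, which is what singularity with two factors and one shock means.

```python
import numpy as np


def simulate_factors(a=0.7, T=500, seed=0):
    """Simulate f_t = a f_{t-1} + u_t and return the rows (f_t, f_{t-1}), t = 1..T."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(T + 1)
    f = np.zeros(T + 1)
    for t in range(1, T + 1):
        f[t] = a * f[t - 1] + u[t]
    return np.column_stack([f[1:], f[:-1]])


def var1_residual_cov(F):
    """OLS fit of F_t on F_{t-1} (no intercept); returns the residual covariance."""
    Y, X = F[1:], F[:-1]
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    E = Y - X @ B
    return E.T @ E / len(E)


F = simulate_factors()
eigvals = np.linalg.eigvalsh(var1_residual_cov(F))  # ascending order
print("eigenvalues of the residual covariance:", eigvals)
```

The second coordinate of the stacked vector is an exact function of the past, so only one eigenvalue is numerically different from zero.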
Now suppose that the factors have been estimated. Obtaining the common shocks and the impulse-response functions of the variables with respect to them (or to structural shocks obtained by a linear transformation of the common shocks) requires the estimation of a VAR for the singular I(1) factor vector. On the other hand, the latter is necessarily cointegrated with cointegration rank c at least equal to r − q (the rank of the spectral density of the differenced factor vector does not exceed q at all frequencies and, therefore, at frequency zero).
Singular I(1) vectors of factors in an I(1) DFM and singular I(1) vectors in DSGE models provide strong motivation for studying I(1) singular vectors in a general time-series context. The main contributions of the paper are:
- A generalization of Johansen’s proof of the Granger Representation Theorem (from MA to AR); this is Proposition 2. Consider an I(1) singular vector with dimension r, rank q < r, and cointegrating rank c. Assuming that its first difference has an ARMA structure, and that some simple additional conditions hold, the vector has a representation as a vector error correction mechanism (VECM) with c error correction terms; this is Equation (4).
- Assuming that the parameters of the ARMA structure may vary in an open subset of a Euclidean space (see Section 3.2 for the definition of the parameter set), in Proposition 3 we show that all the assumptions used to obtain (4), and also the assumption that unity is the only possible zero of the moving-average matrix polynomial, hold for generic values of the parameters. This implies that the autoregressive matrices of the error correction representation are generically of finite degree, which is obviously not the case for nonsingular vectors.
The paper is organized as follows. Section 2 is preliminary. We firstly recall recent results for stationary singular stochastic vectors with rational spectral density, see (Anderson and Deistler 2008a, 2008b). Secondly, we discuss cointegration and the cointegrating rank for singular stochastic vectors.
In Section 3 we prove our main results. We also obtain the permanent-transitory shock representation in the singular case: the vector is driven by r − c permanent shocks, i.e., r minus the cointegrating rank, the usual result. However, the number of transitory shocks is q − (r − c), not c as in the nonsingular case.
Section 3 also contains an exercise carried out with simulated singular vectors. We compare the results obtained by estimating an unrestricted VAR in the levels and a VECM. Though limited to a simple example, the results confirm what has been found for nonsingular vectors, that under cointegration the long-run features of impulse-response functions are better estimated using a VECM rather than an unrestricted VAR in the levels (Phillips 1998).
In Section 4 we analyse cointegration of the observable variables in a DFM. Our results on cointegration of the singular factor vector have the implication that p-dimensional subvectors of the n-dimensional common-component vector, with p > r − c, are cointegrated. As a consequence, stationarity of the idiosyncratic components would imply that all p-dimensional subvectors of the n-dimensional dataset are cointegrated if p > r − c. For example, if r − c = 2, then all 3-dimensional subvectors in the dataset are cointegrated, a kind of regularity that we do not observe in actual large macroeconomic datasets. This suggests that an estimation strategy robust to the assumption that the idiosyncratic components can be I(1) has to be preferred; for this aspect we refer to Barigozzi et al. (2019). Section 5 concludes. Some proofs, a discussion of some non-uniqueness problems arising with singularity and details on the simulations are collected in the Appendix.
2. Stationary and Singular Vectors
2.1. Stationary Singular Vectors
As in this paper we only consider representation issues it is convenient to assume that all stochastic processes are defined for . Accordingly, the lag operator L is defined as for (Bauer and Wagner (2012) also study and cointegrated processes for ).
We start by introducing results on singular vectors with an ARMA structure from (Anderson and Deistler 2008a, 2008b). Some preliminary definitions are needed.
(Zeros and Poles)
(A) When considering matrices whose entries are rational functions of we always assume that numerator and denominator of each entry have no common roots. If is an matrix of rational functions, we say that is a pole of if it is a pole of some entry of .
(B) Suppose that is an matrix whose entries are polynomial functions of , with . We say that is a zero of if rank, and that is zeroless if it has no zeros, i.e., rank for all .
With a minor abuse of language, we may speak of zeros and poles of the corresponding matrix . When a polynomial matrix has all its zeros outside the unit circle we say that is stable.
All the stationary vector processes considered have an ARMA structure. Precisely, the r-dimensional process has an ARMA structure with rank q, , if there exist
- a non-singular q-dimensional white-noise process ,
- an stable polynomial matrix , with ,
- an matrix whose rank is q for all z with the exception of a finite subset of , such that
Suppose that has also the representation , where is a -dimensional nonsingular white noise. Denoting by the spectral density of ,so that the rank of is q for all , with the exception of a finite subset of . As the spectral density is independent of the ARMA representation, and has rank q except for a finite subset of .
Let us recall that the equation
in the unknown vector process , where is stable, has only one stationary solution, and this is Thus the ARMA process can also be defined as the stationary solution of .
(Genericity) Suppose that a statement Q depends on a parameter vector ranging over Π, where Π is an open subset of a Euclidean space. We say that Q holds generically in Π, or that Q holds for generic values of the parameters, if the subset of Π where it does not hold is nowhere dense in Π, i.e., the closure of that subset in Π has no internal points.
For example, assuming that , the statement “The roots of the polynomial are distinct” holds generically in .
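As a concrete instance of this definition, consider a monic quadratic polynomial whose two coefficients are the parameters. The degree and the symbols for the coefficients are chosen here purely for illustration and are not taken from the text.

```latex
\begin{aligned}
p(z) &= z^{2} + \pi_{1} z + \pi_{2}, \qquad (\pi_{1},\pi_{2}) \in \Pi = \mathbb{R}^{2},\\
p \text{ has a repeated root} &\iff \pi_{1}^{2} - 4\pi_{2} = 0 .
\end{aligned}
```

The exceptional set is a parabola in the plane: it is closed and contains no open disc, hence it is nowhere dense, so that distinctness of the roots holds generically.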
(Rational reduced-rank family of filters)Assume that and let be a set of ordered couples , where:
- is an polynomial matrix of degree .
- is an polynomial matrix of degree . .
- Denoting by the vector containing the coefficients of the entries of and , we assume that , where Π is an open subset of such that for ,(1) is stable,(2) with the exception of a finite subset of .
We say that is a rational reduced-rank family of filters with parameter set Π.
The notation , , though more rigorous, would be heavy and not really necessary. We use it only in Appendix A.1.
Assume that .
- Suppose that is an matrix polynomial in L. If is zeroless then has an finite-degree stable left inverse, i.e., there exists a finite-degree polynomial matrix such that:(a),(b) implies ,(c). Let be the stationary solution of and suppose that is zeroless. Then has a finite vector autoregressive representation (VAR) , where and is a finite-degree left inverse of .
- Assume that is the stationary solution of , where belongs to a rational reduced-rank family of filters with parameter set Π. For generic values of the parameters in Π, is zeroless so that has a finite VAR representation.
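A minimal numerical sketch of part (I), with one shock and two variables. The coefficients `a` and `b` are hypothetical values chosen for the illustration; the only requirement is that they differ, so that the two entries of the moving-average filter never vanish at the same point and the filter is zeroless. In that case a constant row vector is a left inverse, the shock is recovered exactly from the current observation, and the vector has an exact VAR(1) representation.

```python
import numpy as np

a, b = 0.5, -0.3  # hypothetical MA coefficients; a != b makes the filter zeroless
rng = np.random.default_rng(1)
u = rng.standard_normal(200)

# y_t = (u_t + a u_{t-1}, u_t + b u_{t-1})': two variables, one shock
y = np.column_stack([u[1:] + a * u[:-1], u[1:] + b * u[:-1]])

# Constant left inverse c' = (c1, c2): c1 + c2 = 1 and c1 * a + c2 * b = 0
c = np.array([b, -a]) / (b - a)
u_hat = y @ c  # equals u_t exactly, up to rounding

# Implied finite VAR: y_t = (a, b)' c' y_{t-1} + (1, 1)' u_t
resid = y[1:] - np.outer(u_hat[:-1], [a, b])

print(np.max(np.abs(u_hat - u[1:])))
```

With a equal to b the two rows coincide, the filter has a zero, and no finite-degree left inverse exists unless the common factor is trivial.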
Assume that the r-dimensional vector has an ARMA structure, rank q and the moving average representation (5). If for , then belongs to the space spanned by , with , and representation (5), as well as , is called fundamental (for these definitions and results see e.g., Rozanov (1967), pp. 43–7). Note that if (5) is fundamental Note also that when , the condition that for becomes for .
Note that in Proposition 1, part (II), we do not assume that is fundamental for . However, Proposition 1, (II), states that for generic values of the matrix is zeroless and therefore is fundamental for .
2.3. Singular Vectors
To analyze cointegration and the autoregressive representations of singular non-stationary vectors let us first recall the definitions of , and cointegrated vectors. This requires some preliminary definitions and results.
We denote by the space of the square-integrable functions on the probability space . Let , , be an r-dimensional stochastic process and consider the difference equation in the unknown r-dimensional process . A solution of (6) is , see e.g., Gregoir (1999), p. 439, Franchi and Paruolo (2019). All the solutions of (6) are , where , , is a solution of the homogeneous equation , so that , for some r-dimensional stochastic vector , for all . We say that the process is a constant stochastic process. Obviously a constant stochastic process is weakly stationary. Its spectral measure has a jump at frequency zero. Thus has a spectral density (has an absolutely continuous spectral measure) if and only if , i.e., if and only if , where , for almost everywhere in .
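The structure of the solution set can be illustrated on a finite sample. The sketch below assumes that the particular solution is the partial sum of the right-hand-side process from the first observation onward; this is an assumption made for the example, since the explicit formula is not reproduced here. The names `x`, `W`, `y_particular` and `y_general` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
T, r = 100, 3

x = rng.standard_normal((T, r))  # stand-in for the process on the right-hand side
W = rng.standard_normal(r)       # one draw of a time-invariant random vector

y_particular = np.cumsum(x, axis=0)  # partial sums x_1 + ... + x_t
y_general = y_particular + W         # any other solution differs by a constant process

# Both sequences satisfy y_t - y_{t-1} = x_t for t >= 2
d_particular = np.diff(y_particular, axis=0)
d_general = np.diff(y_general, axis=0)
```

Adding the constant vector changes the level of the process but not its first difference, which is why the solutions are identified only up to a constant stochastic process.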
(I(0), I(1) and Cointegrated vectors)
I(0).An r-dimensional ARMA with spectral density is if .
I(1).The r-dimensional vector stochastic process is if it is a solution where is an r-dimensional process. The rank of is defined as the rank of .
Assume that the r-dimensional stochastic vector is and denote by the spectral density of . The vector is cointegrated with cointegrating rank c, with , if rank.
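If q is the rank of the differenced vector, the rank of its spectral density at frequency zero is at most q, so that the cointegrating rank satisfies c ≥ r − q. Thus in the singular case, q < r, the vector is necessarily cointegrated with cointegrating rank at least equal to r − q.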
If is and cointegrated with cointegrating rank c, there exist c linearly independent vectors , , such that the spectral density of vanishes at frequency zero. The vectors are called cointegrating vectors and the set , , a complete set of cointegrating vectors. Of course a complete set of cointegrating vectors , , can be replaced by the set , , where the vectors are c independent linear combinations of the vectors .
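The rank condition and a complete set of cointegrating vectors can be computed from the long-run impact matrix, i.e., the moving-average filter of the differenced vector evaluated at unity. The sketch below is an illustration only: the function name `cointegration_from_long_run`, the tolerance and the random matrix are hypothetical. It relies on the fact that, with a nonsingular shock covariance, the spectral density at frequency zero has the same rank as the long-run impact matrix.

```python
import numpy as np


def cointegration_from_long_run(A1, tol=1e-10):
    """Return (c, D): cointegrating rank and an r x c basis D with D' A1 = 0.

    A1 is the r x q long-run impact matrix of the differenced vector.
    """
    r = A1.shape[0]
    U, s, _ = np.linalg.svd(A1, full_matrices=True)
    rank = int(np.sum(s > tol * max(s.max(), 1.0)))
    c = r - rank
    D = U[:, rank:]  # left singular vectors associated with zero singular values
    return c, D


rng = np.random.default_rng(3)
r, q = 4, 2
A1 = rng.standard_normal((r, q))  # generic r x q matrix, hence of rank q
c, D = cointegration_from_long_run(A1)
print("cointegrating rank:", c)
```

Any other complete set of cointegrating vectors is obtained by postmultiplying `D` by a nonsingular c × c matrix.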
(I) Assume that has an ARMA structure and has the rational representation (5): . Then is if and only if .
(II) Assume has an ARMA structure and has the rational representation
The process is if and only if .
(III) If is , cointegrated and has representation (7), the cointegrating rank of is c if and only if the rank of is . Moreover is a cointegrating vector for if and only if .
(IV) Assume that is . is a cointegrating vector for if and only if a scalar stochastic variable can be determined such that is stationary with an ARMA structure.
(I) is an immediate consequence of , where is the nonsingular covariance matrix of . (II) and (III) are obtained in the same way from .
(IV) The process solves (6) with , so that, definingwe havewhere (i) the entries of are rational functions of L with no poles of modulus less or equal to unity, (ii) is a constant r-dimensional stochastic process. We have:
If is a cointegrating vector of we have , so that
Setting the process has the desired properties. Note that w has the equivalent definition . Conversely, suppose that w is such that has an ARMA structure. By (9),so that
The three terms on the left-hand side are finite and independent of t. As and is positive definite, the right-hand side diverges for unless . □
Lemma 1 shows that our definitions of and processes are equivalent to Definitions 3.2, and 3.3 in Johansen (1995), p. 35, with two minor differences: (i) our assumption of rational spectral density, (ii) the time span of the stochastic processes is in Johansen’s book, in the present paper. Also, under the assumption that has an ARMA structure, our definition of cointegration is equivalent to that in Johansen (1995), p. 37.
3. Representation Theory for Singular Vectors
In Section 3.1 we prove our generalization to singular vectors of the Granger representation theorem (from MA to AR). We closely follow the proof in Johansen (1995), Theorem 4.5, pp. 55–57. In Section 3.2 we show that, under a suitable parameterization, the matrix polynomial of the autoregressive representation is generically of finite degree.
3.1. The Granger Representation Theorem (MA to AR)
Suppose that , and Let be an polynomial matrix of degree and an polynomial matrix of degree with .
If is a zero of (i.e. ) then either or .
Assumption 2 implies that the rank of is q. The next is a stronger version of Assumption 2:
If is a zero of then .
Under Assumption 1, let be a solution of the equation
We havewhere is defined in (8) and is a constant stochastic process. By Assumption 4, , so that is with cointegrating rank c, see Lemma 1, (II) and (III).
Consider the finite Taylor expansion of around :
Assumption 4 implies thatwhere is of rank , is of rank , see Lancaster and Tismenetsky (1985, p. 97, Proposition 3). The Taylor expansion above can be rewritten aswhere and is a polynomial matrix.
Let be an matrix whose columns are orthogonal to all columns of : (i) the columns of are a complete set of cointegrating vectors for , (ii) the columns of the matrix are a complete set of cointegrating vectors for . Regarding (i), using (11) and (12), we haveso that has an ARMA structure. Regarding (ii), see the proof of Proposition 2.
We are now ready for our main representation result.
(I) Weak form. Suppose that Assumptions 1, 2, 4, 5 and 6 hold and let be a solution of the difference Equation (10), so that , with defined in (8) and a constant stochastic process. Set . Then a c-dimensional stochastic vector can be determined such that (i) is , (ii) has the error correction representationwhere is a rational matrix with no poles in or on the unit circle, , is and full rank, .
(II) Strong form. Under Assumptions 1, 3, 4, 5 and 6, statement (I) holds with an stable, finite-degree matrix polynomial .
Multiply both sides of by the invertible matrix . We obtain
Taking the first c rows in (15),
This implies thatwhere is a c-dimensional constant stochastic vector. Comparing with (13), . On the other hand,where the last equality has been obtained using (16) and is a suitable polynomial matrix. Thus has an ARMA structure. Moreover, by Assumption 6, is .
By Assumption 5, has no zero at , see (19). On the other hand, (i) if is a zero of then is a zero of , (ii) if is a zero of , , then is a zero of . Therefore, Assumption 3 implies that is zeroless and vice versa. Under Assumption 2, the zeros of lie outside the unit circle. In order to conclude the proof we need to invert in (18).
(I) Under Assumption 3, Proposition 1, part (I), states that there exists an stable, finite-degree polynomial matrix for some p, such that: (i) , (ii) .
(II) Under Assumption 2, by a standard procedure we remove all the zeros of which lie outside the unit circle, then use Proposition 1, part (I), to left-invert the residual zeroless polynomial, thus obtaining an rational matrix such that (i) has no poles in or on the unit circle (possible poles of are the zeros of , which lie outside the unit circle), (ii) , (iii) . See also Deistler et al. (2010).
Definingand using , we havewith Defining
Definingwe see that and□
Some remarks are in order.
(I) Under our assumption of an ARMA structure, Assumption 1 corresponds to Definition 3.1 in Johansen’s book, see p. 34. Assumption 2 is Johansen’s Assumption 1 (see p. 14), adapted for singularity. Assumption 3 has no counterpart in Johansen’s nonsingular framework. In Section 3.2 we show that under the parameterization adopted in Definition 5, Assumption 3 holds generically.
(II) Simplifying the model by taking , Assumption 5 generalizes to the singular case Johansen’s assumption that is full rank (see Theorem 4.5, p. 55; corresponds to our ). For, assuming that , multiplying the matrix in Assumption 5 by the nonsingular matrix , we obtain that Assumption 5 holds if and only if is full rank. Assumption 5 is used in the proof of Proposition 2 to invert the matrix , which remains on the right-hand side after the removal of the unit roots, see Equation (18), which is the same rôle played by Johansen’s assumption in his proof.
(III) Under , Assumption 6 simplifies to . If Assumption 6 is a consequence of Assumption 5. For, if then . On the other hand, is the number of rows of , so that Assumption 5 holds only if Assumption 6 holds. In particular, if and , Assumption 6 is redundant. However if and , so that the rank of is q, then Assumption 5 holds even if . Assumption 6 is necessary in Proposition 2 to prove that the error correction term is , not only stationary.
Uniqueness issues arise with autoregressive representations of singular vectors. For example, suppose that , so that . Representation (14) has an -dimensional error correction term . On the other hand, in this case has full rank q, so that Proposition 1 (I) applies and, in spite of cointegration, has an autoregressive representation in differences
In Appendix B.1 we sketch a proof of the statement that in general, has VECM representations with a number of error correction terms ranging from d to c. However, as we show in Appendix B.2, different autoregressive representations of produce the same impulse-response functions. Both in this and the companion paper Barigozzi et al. (2019) the number of error correction terms in the error correction representation for reduced-rank I(1) vectors is always the maximum c. It is worth reporting that, in our experiments with simulated data, the best results in estimation of singular VECMs are obtained using c as the number of error correction terms.
Assume for simplicity that . From Equation (17):
If , Assumption 5 implies that has rank c, so that no c-dimensional vector can be determined such that some of the coordinates of are stationary but not . Thus, according to the definition introduced in Franchi and Paruolo (2019), p. 1181, the error term is a “non-cointegrated process.” When and , i.e., , elementary examples can be produced in which is an but not a non-cointegrated process (one is given in Appendix A.2). Thus Assumption 6 only implies that is . Of course, under , the assumption that has rank c, an enhancement of Assumption 6, implies that is a non-cointegrated process. On the other hand, if , i.e., , cannot be a non-cointegrated process.
3.2. Generically, the Autoregressive Matrix Is a Finite-Degree Polynomial
Suppose that the couple of matrix polynomials is parameterized as in Definition 3. It is easy to see that the moving-average matrix evaluated at unity has generically rank q, so that generically the cointegrating rank is r − q. In particular, if r = q cointegration is non-generic.
It is quite easy to see that this paradoxical result only depends on the choice of a parameter set that is unfit to study cointegration. Our starting point here is that a specific value of c between and has a motivation in economic theory or in statistical inference, and must be therefore built in the parameter set. Thus in Definition 5 below the family of filters is redefined so that generically the cointegrating rank is equal to a given c between and .
(Rational reduced-rank family of filters with cointegrating rank c)Assume that , and . Let be a set of couples , where:
- The matrix has the parameterization
- is an polynomial matrix of degree . .
- Denoting by the vector containing the coefficients of the matrices , , , and , we assume that , where Π is an open subset of such that for :(1) is stable,(2) with the exception of a finite subset of ,(3).
We say that is a rational reduced-rank family of filters with cointegrating rank c.
Assume that . Let be a solution of Equation (10), where belongs to a rational reduced-rank family of filters with cointegrating rank c. For generic values of the parameters in Π, Assumptions 1, 3, 4, 5 and 6 hold. Thus the Strong Form of Proposition 2 holds and has an error correction representationwhere is a finite-degree polynomial matrix.
Part (iii) of Definition 5 implies that Assumptions 1 and 4 hold for all . The sets where Assumptions 5 and 6 do not hold are the intersections of the open set with the algebraic varieties(the variety described by (a) is obtained by equating to zero the determinant of all the submatrices of the matrix between brackets). It is easy to see that the varieties (a) and (b) are not trivial, i.e., that their dimension is lower than . Thus Assumptions 5 and 6 hold generically. The same result holds for Assumption 3. The points of where it is not fulfilled belong to a lower-dimensional algebraic variety. This is proved in Appendix A.1, see in particular Lemma A4. □
It is easy to see that, assuming that , holds generically in Π. Thus, in that case, the error term is generically a non-cointegrated process, see Remark 6.
A general comment on genericity results is in order. Theorems like Proposition 3 or Proposition 1, part (II), show that the subset where some statement does not hold belongs to some algebraic variety of lower dimension (see the proof of Proposition 3 in particular), and is therefore negligible from a topological point of view. This suggests the working hypothesis that such a subset is negligible from an economic or statistical point of view as well. If, for example, economic theory produces a singular vector with cointegrating rank c, we may find it reasonable to conclude that the vector has representation (14) with a finite autoregressive polynomial. However, a greater degree of certainty is obtained by checking that the parameters that are implicit in the theory do not necessarily lie in one of the three algebraic varieties described in the proof of Proposition 3.
Definition 5 does not assume that has no zeros inside the unit circle. Thus we have not assumed that is fundamental for , see Section 2.2. However, Proposition 3 shows that for generic values of the parameters in , the assumptions of Proposition 2, strong form, hold, Assumption 3 in particular, so that has no zeros of non-unit modulus and therefore inside the unit circle. Thus:
Assume that . Let be a solution of Equation (10), where belongs to a rational reduced-rank family of filters with cointegrating rank c. For generic values of the parameters in Π, is fundamental for .
Note that Propositions 3 and 4 do not hold in the nonsingular case, where no genericity argument can be used to rule out non-unit zeros of , either inside or outside the unit circle. In particular, fundamentalness of for is not generic if .
3.3. Permanent and Transitory Shocks
Let be a matrix whose columns are independent and orthogonal to the columns of , and let
Defining , and , we have
We havewhere , and . All the solutions of the difference equation arewhere is a constant stochastic process, and
As is full rank, we see that is driven by the permanent shocks , and by the d temporary shocks . In representation (21), the component is the common-trend of Stock and Watson (1988). Note that the number of permanent shocks is obtained as r minus the cointegrating rank, as usual. However, the number of transitory shocks is only , as though transitory shocks had a zero coefficient.
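The counting of shocks can be made explicit with a small worked example. It assumes the relation c = r − q + d between the cointegrating rank, the dimension, the rank and the number of transitory shocks, which is our reading of the statements above; the numerical values are chosen only for illustration.

```latex
\begin{aligned}
r = 4,\quad q = 3,\quad d = 1
&\;\Longrightarrow\; c = r - q + d = 2,\\
\text{permanent shocks} &= r - c = 2,\\
\text{transitory shocks} &= q - (r - c) = 1 = d,\\
c - d &= r - q = 1 .
\end{aligned}
```

In the nonsingular case r = q the last line gives c = d, so that the number of transitory shocks equals the cointegrating rank, as usual.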
3.4. VECMs and Unrestricted VARs in the Levels
Several papers have addressed the issue of whether and when an error correction model or an unrestricted VAR in the levels should be used for estimation in the case of nonsingular cointegrated vectors: Sims et al. (1990) have shown that the parameters of a cointegrated VAR are consistently estimated using an unrestricted VAR in the levels; on the other hand, Phillips (1998) shows that if the variables are cointegrated, the long-run features of the impulse-response functions are consistently estimated only if the unit roots are explicitly taken into account, that is within a VECM specification. The simulation exercise described below provides evidence in favour of the VECM specification in the singular case.
- For each replication, we estimate a (misspecified) VAR in differences (DVAR), a VAR in the levels (LVAR) and a VECM, as in Johansen (1988, 1991), assuming known c, the degree of and that of . For the VAR in differences the impulse-response functions for are cumulated to obtain impulse-response function for . The root mean square error between estimated and actual impulse-response functions is computed for each replication using all 12 impulse-responses and averaged over all replications.
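The error measure described in this step can be sketched as follows. The array layout (replications × responses × lags) and the function name `irf_rmse_by_lag` are assumptions made for the illustration and do not reproduce the authors' code: for each replication and each lag the squared errors are averaged across all impulse-responses, the square root is taken, and the result is averaged over replications.

```python
import numpy as np


def irf_rmse_by_lag(irf_est, irf_true):
    """RMSE between estimated and true impulse responses, one value per lag.

    irf_est  : array of shape (n_rep, n_resp, n_lags), estimated responses
    irf_true : array of shape (n_resp, n_lags), true responses
    """
    sq_err = (irf_est - irf_true[None, :, :]) ** 2
    rmse_per_rep = np.sqrt(sq_err.mean(axis=1))  # across the n_resp responses
    return rmse_per_rep.mean(axis=0)             # average over replications


# Toy usage: 12 responses, 5 lags, 7 replications with a constant unit error
irf_true = np.zeros((12, 5))
irf_est = np.ones((7, 12, 5))
rmse = irf_rmse_by_lag(irf_est, irf_true)
print(rmse)
```

For a VAR in differences the estimated responses would have to be cumulated along the lag axis, for instance with `np.cumsum(..., axis=-1)`, before being passed to the function.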
The results are shown in Table 1. We see that the RMSE of both the VECM and the LVAR decreases as T increases. However, for all values of T, the RMSE of the VECM stabilizes as the lag increases, whereas it deteriorates for the LVAR, in line with the claim that the long-run responses of the variables are better estimated with the VECM. The performance of the misspecified DVAR is uniformly poor with the exception of lag zero.
4. Cointegration of the Observable Variables in a DFM
Consider again the factor model , rewritten here aswhere is , with . The relationship between cointegration of the factors and cointegration of the variables is now considered.
Let us recall that the common factors are assumed to be orthogonal to the idiosyncratic components for all , i.e., for all , see the Introduction. The other assumptions on model (22) are asymptotic, see e.g., Forni et al. (2000); Forni and Lippi (2001); (Stock and Watson 2002a, 2002b), and put no restriction on the matrix and the vector for a given finite n. In particular, the first r eigenvalues of the matrix must diverge as n → ∞, but this has no implications on the rank of the matrix corresponding to, say, . However, as we see in Proposition 5 (iii), if the idiosyncratic components are I(0), then, independently of , all p-dimensional subvectors of are cointegrated for p > r − c, which is at odds with what is observed in the macroeconomic datasets analyzed in the empirical Dynamic Factor Model literature. This motivates assuming that the idiosyncratic vector is I(1). In that case, see Proposition 5 (i), cointegration of requires that both the common and the idiosyncratic components are cointegrated. Some results are collected in the statement below.
Let be a p-dimensional subvector of , . Denote by and the cointegrating rank of and respectively. Both range from p, stationarity, to 0, no cointegration.
- is cointegrated only if and are both cointegrated.
- If then is cointegrated. If and rank then is cointegrated.
- Let and be the cointegrating spaces of and respectively. The vector is cointegrated if and only if the intersection of and contains non-zero vectors. In particular, (a) if and then is cointegrated, (b) if and is stationary then is cointegrated.
Because and are orthogonal for all , the spectral densities of , , fulfill:
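Now, (23) implies inequality (24), which involves the smallest eigenvalue of a Hermitian matrix: the smallest eigenvalue of the spectral density of the subvector at frequency zero is bounded below by the sum of the smallest eigenvalues of the spectral densities of its common and idiosyncratic components. This is one of Weyl’s inequalities, see Franklin (2000), p. 157, Theorem 1. Because the spectral density matrices are non-negative definite, the right-hand side in (24) vanishes if and only if both terms on the right-hand side vanish. Hence, if the spectral density of the subvector is singular at zero, the spectral densities of both its common and idiosyncratic components are singular at zero. By Definition 4, (i) is proved.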
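The inequality used in this step can be checked numerically. The sketch below draws two random Hermitian non-negative definite matrices, one singular and one of full rank, as stand-ins for the spectral densities of the common and idiosyncratic components at frequency zero. The helper names `random_psd` and `lambda_min` and all dimensions are hypothetical.

```python
import numpy as np


def random_psd(p, rank, rng):
    """Random p x p Hermitian non-negative definite matrix of the given rank."""
    B = rng.standard_normal((p, rank)) + 1j * rng.standard_normal((p, rank))
    return B @ B.conj().T


def lambda_min(M):
    """Smallest eigenvalue of a Hermitian matrix."""
    return np.linalg.eigvalsh(M)[0]


rng = np.random.default_rng(4)
p = 3
S_common = random_psd(p, 2, rng)  # singular: rank 2 < p
S_idio = random_psd(p, 3, rng)    # full rank with probability one

lhs = lambda_min(S_common + S_idio)
rhs = lambda_min(S_common) + lambda_min(S_idio)
print(lhs, rhs)
```

Here the sum is nonsingular although one of the two terms is singular, in line with the fact that singularity of the sum requires singularity of both terms.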
Without loss of generality we can assume that . By substituting (21) in (22), we obtainwhere on the right hand side the only non-stationary terms are and possibly . By recalling that where is of dimension and rank , and by defining and , we can rewrite (25) as
For :where and have an obvious definition. Of course cointegration of the common components is equivalent to cointegration of , which in turn is equivalent to rank. Statement (ii) follows from
The first part of (iii) is obvious. Assume now that . If , i.e., if , then the intersection between and is non-trivial, so that is cointegrated. □
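The dimension argument behind statement (iii) can be illustrated by computing the intersection of two subspaces. In the sketch below the columns of `B1` and `B2` stand for bases of the two cointegrating spaces; the names, the dimensions and the random draws are hypothetical. Vectors in the intersection are obtained from the null space of the matrix that places the first basis next to the negative of the second.

```python
import numpy as np


def null_space(M, tol=1e-10):
    """Orthonormal basis of the null space of M, computed by SVD."""
    _, s, Vt = np.linalg.svd(M, full_matrices=True)
    rank = int(np.sum(s > tol * max(s.max(), 1.0)))
    return Vt[rank:].T


def intersection_basis(B1, B2):
    """Spanning set of span(B1) intersected with span(B2); full column rank assumed."""
    N = null_space(np.hstack([B1, -B2]))  # pairs (a, b) with B1 a = B2 b
    return B1 @ N[: B1.shape[1]]


rng = np.random.default_rng(5)
p, c1, c2 = 3, 2, 2  # c1 + c2 > p, so the intersection cannot be trivial
B1 = rng.standard_normal((p, c1))
B2 = rng.standard_normal((p, c2))
V = intersection_basis(B1, B2)
print(V.ravel())
```

With c1 + c2 > p the intersection has dimension at least c1 + c2 − p, here one, and any non-zero vector in it is a cointegrating vector for both components.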
5. Summary and Conclusions
The paper studies representation theory for singular I(1) stochastic vectors, the factors of an I(1) Dynamic Factor Model in particular. Singular I(1) vectors are cointegrated, with a cointegrating rank c equal to r − q, the dimension of the vector minus its rank, plus d, with 0 ≤ d < q.
If the first difference of the vector has rational spectral density, under assumptions that generalize to the singular case those in Johansen (1995), we show that the vector has an error correction representation with c error terms, thus generalizing the Granger representation theorem (from MA to AR) to the singular case. Important consequences of singularity are that generically: (i) the autoregressive matrix polynomial of the error correction representation is of finite degree, (ii) the white noise vector driving the process is fundamental.
We find that the vector is driven by r − c permanent shocks and d = c − (r − q) transitory shocks, not c as in the nonsingular case.
An exercise with simulated data generated by a simple singular VECM confirms previous results, obtained for nonsingular vectors, showing that under cointegration the long-run features of impulse-response functions are better estimated using a VECM rather than a VAR in the levels.
In Section 4 we argue that stationarity of the idiosyncratic components in a DFM produce an amount of cointegration for the observable variables that is not observed in the datasets that are standard in applied Dynamic Factor Model literature. Thus the idiosyncratic vector in those datasets is likely to be , so that an estimation strategy robust to the assumption that some of the idiosyncratic variables are should be preferred.
The results in this paper are the basis for estimation of Dynamic Factor Models with cointegrated factors, which is developed in the companion paper (Barigozzi et al. 2019).
All authors contributed equally to the paper. All authors have read and agreed to the published version of the manuscript.
This research received no external funding.
Dietmar Bauer, Manfred Deistler, Massimo Franchi, Martin Wagner, three anonymous referees and the Editors of this Special Issue gave important suggestions for improvements. We also thank the participants in the Workshop on Estimation and Inference Theory for Cointegrated Processes in the State Space Representation, Technische Universität Dortmund, January 2016. Part of this paper was written while Matteo Luciani was chargé de recherches F.R.S.-F.N.R.S., and he gratefully acknowledges their financial support. Of course, we are responsible for any remaining errors.
The views expressed in this paper are those of the authors and do not necessarily reflect those of the Board of Governors or the Federal Reserve System.
Conflicts of Interest
The authors declare no conflict of interest.
Appendix A. Proofs
Appendix A.1. Assumption 3 Holds Generically
Proving that Assumption 3 holds generically is equivalent to proving that is generically zeroless, see the argument below Equation (20).
We need some preliminary results. Lemma A1, though quite easy, is not completely standard and is therefore carefully stated and proved below. Regarding notation, to avoid possible misunderstandings, let us recall that vectors and matrices are always denoted by boldface symbols, while light symbols denote scalars, see Lemmas A1 and A2 in particular.
Lemma A1. Let , , be scalar polynomials defined on , let and be the statement ; for example, the statement that all the minors of vanish, i.e., that . Let Π be an open subset of . If Q is false for one point , then Q is generically false in Π.
Proof. Let be the closure in (in the topology of ) of the subset of where Q is true. Suppose that Q is not generically false in . Then the interior of in , call it , is not empty. As is open, is open both in the topology of and of . On the other hand, a polynomial function defined on vanishes on an open set if and only if it vanishes on the whole , which contradicts the existence of a point in where Q is false. □
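The content of Lemma A1 can be illustrated numerically: a statement of the form "these polynomials all vanish" may be true at special parameter points, but once it fails at one point it fails at essentially every randomly drawn point. The sketch below uses the example mentioned in the lemma, the vanishing of all minors of a tall matrix. The dimensions, the sampling range and the function name `all_minors_vanish` are assumptions chosen for illustration.

```python
import itertools
import numpy as np

def all_minors_vanish(m, tol=1e-12):
    """Statement Q: every q x q minor of the r x q matrix m is numerically zero."""
    r, q = m.shape
    return all(
        abs(np.linalg.det(m[list(rows), :])) < tol
        for rows in itertools.combinations(range(r), q)
    )

rng = np.random.default_rng(1)
r, q = 3, 2

# Q is true at a special, rank-deficient parameter point ...
special = np.outer(np.ones(r), np.ones(q))
q_true_at_special = all_minors_vanish(special)

# ... but it is false at randomly drawn parameter points.
draws = [rng.uniform(-1.0, 1.0, size=(r, q)) for _ in range(1000)]
q_true_count = sum(all_minors_vanish(m) for m in draws)
print(q_true_at_special, q_true_count)
```

A random search of this kind is only suggestive; the lemma itself rests on the fact that a polynomial vanishing on an open set vanishes everywhere.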
Lemma A2. Consider the scalar polynomials with and , and let , and , , be the roots of A and B, respectively. Then: (i) where R is a polynomial function which is called the resultant of A and B. (ii) The resultant vanishes if and only if A and B have a common root. (iii) Suppose that the coefficients and are polynomial functions of , where Π is an open subset of . If there exists a point such that , , and , then generically in Π the polynomials A and B have no common roots.
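For readers who want to experiment with resultants, the sketch below computes the resultant of two scalar polynomials as the determinant of their Sylvester matrix and compares it with the standard product formula over the roots. This is the textbook construction, not necessarily the notation used in the paper; the function names and the example polynomials are assumptions for illustration. Coefficients are listed from the highest degree to the constant term.

```python
import numpy as np

def sylvester_matrix(a, b):
    """Sylvester matrix of polynomials a and b (coefficients, highest degree first)."""
    n, m = len(a) - 1, len(b) - 1           # degrees of A and B
    s = np.zeros((n + m, n + m))
    for i in range(m):                       # m shifted copies of A's coefficients
        s[i, i:i + n + 1] = a
    for i in range(n):                       # n shifted copies of B's coefficients
        s[m + i, i:i + m + 1] = b
    return s

def resultant(a, b):
    """Resultant as the determinant of the Sylvester matrix."""
    return float(np.linalg.det(sylvester_matrix(a, b)))

def resultant_from_roots(a, b):
    """Leading coefficients times the product of all root differences."""
    n, m = len(a) - 1, len(b) - 1
    prod = np.prod([ai - bj for ai in np.roots(a) for bj in np.roots(b)])
    return complex((a[0] ** m) * (b[0] ** n) * prod).real

A = [2.0, -6.0, 4.0]        # 2 (x - 1)(x - 2)
B = [1.0, -3.0]             # x - 3, no root shared with A
B_common = [1.0, -2.0]      # x - 2, shares the root 2 with A

print(resultant(A, B), resultant_from_roots(A, B), resultant(A, B_common))
```

The determinant form is useful in the genericity argument because it is a polynomial in the coefficients, so Lemma A1 applies to it directly.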
Lemma A3. Recall that a zero of is a complex number such that . If has two submatrices whose determinants have no common roots, then is zeroless.
Proof. If is a zero of , then is a zero of all the submatrices of . □
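The criterion above can be checked numerically on small examples. The sketch below takes a hypothetical 3 × 2 polynomial matrix, forms the determinants of two of its 2 × 2 submatrices, and tests whether they share a root; it then confirms the conclusion by evaluating the rank at the candidate points. It assumes the usual definition of a zero as a point where the rank of the matrix drops below its number of columns. The two example matrices and all function names are illustrative assumptions.

```python
import numpy as np
from numpy.polynomial import Polynomial as P

def minor_det(m, rows):
    """Determinant of the 2 x 2 submatrix of m formed by the two given rows."""
    i, j = rows
    return m[i][0] * m[j][1] - m[i][1] * m[j][0]

def have_common_root(p1, p2, tol=1e-8):
    """True if the two scalar polynomials share a (numerical) root."""
    return any(abs(a - b) < tol for a in p1.roots() for b in p2.roots())

def rank_at(m, z):
    """Rank of the polynomial matrix m evaluated at the point z."""
    return int(np.linalg.matrix_rank(np.array([[p(z) for p in row] for row in m])))

one, zero = P([1.0]), P([0.0])

# Two minors with distinct roots (2 and 4): no zeros.
zeroless = [[P([1.0, -0.5]), zero], [zero, P([1.0, -0.25])], [one, one]]
# All minors vanish at z = 2: the matrix has a zero there.
with_zero = [[P([1.0, -0.5]), zero], [zero, P([1.0, -0.5])], [one, one]]

no_common = not have_common_root(minor_det(zeroless, (0, 2)),
                                 minor_det(zeroless, (1, 2)))
print(no_common, rank_at(zeroless, 2.0), rank_at(zeroless, 4.0), rank_at(with_zero, 2.0))
```

The sufficient condition inspects only two minors, which is what makes it convenient for the genericity proof: it reduces zerolessness to the non-vanishing of a single resultant.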
For the statement and proof of our last result it is convenient to make explicit the dependence of the matrix and its submatrices on the vector . Thus we use , etc. The parameters of the matrix play no role here. Hence, with no loss of generality, we assume , so that Lemmas A2–A4 below imply that Assumption 3 holds generically in .
Let be all the submatrices of , let be the leading coefficient of , and let be the resultant of and . There exist i, j such that
Assume that . To each there corresponds the matrix□
Of course, the definition of makes sense for all , see Equation (19). Let and be the matrices obtained from by removing the first and the last row respectively. We have:
We will construct a point such that: (A) the coefficient of in and the coefficient of in (the leading coefficients) do not vanish, (B) the resultant of and does not vanish.
Let us first define a family of matrices, denoted by , obtained by specifying , , , and in the following way: where: the entries e, , , and being scalar polynomials of degree .
We denote by the vector including the coefficients of the polynomials , and , , a total of coefficients, by the vector including the coefficients of the polynomials e, , and , a total of coefficients, by the vector including the zeros and the ones in the definition of , , , and define , which is a -dimensional parameter vector. We put no restriction on and , so that both can take any value in , with . Note that does not necessarily belong to . We have:
The matrix has zero entries except for the diagonal joining the positions and , and the diagonal joining and . The matrices and are upper- and lower-triangular, respectively, and
Note that does not depend on , while does not depend on . Thus we use the notation , , , . Now:
- (i) Let be such that none of the leading coefficients of the polynomials e, and vanishes. Of course .
- (ii) Let be a root of . If then is not a root of for all . Suppose that is a root of , for some j. As the parameters of the polynomials and are free to vary in , then, generically in , . Iterating for all roots of , generically in , and have no roots in common. Moreover, generically in , . Thus, there exists such that (a) , (b) and have no roots in common.
- (iii) Now let , so that
Using (i) and (ii), (A) the leading coefficients of and do not vanish, (B) and have no root in common so that their resultant does not vanish. This proves the proposition for .
Generalizing this result to is easy. Let us define the family in the following way: (a) specify , , and as in the definition of , (b) then let  We have:  It is easy to see that the lower submatrix of is identical to the matrix in (A1).
Appendix A.2. If r > q and c ≤ q, Assumptions 5 and 6 Do Not Imply That e_t Is a Non-Cointegrated I(0) Process
Let , , ,
In this case and , so that (see Remark 6). We have
We see that Assumptions 5 and 6 hold. However, , so that , though being , is not a non-cointegrated process. On the other hand, if the entry of is 1 instead of 0, is non-cointegrated.
Appendix B. Non Uniqueness
In Proposition 3 we prove that a singular vector with cointegrating rank c has a finite error correction representation with c error terms. On the other hand, as we have seen in Remark 5, when the singular vector also has an autoregressive representation in the differences, i.e., a representation with zero error terms. In Appendix B.1 we give an example suggesting that has error correction representations with any number of error terms between d and c. However, in Appendix B.2 we show that all such representations produce the same impulse-response functions.
Appendix B.1. Alternative Representations with Different Numbers of Error Terms
Let and consider the following example, with , , , so that :
We have, where gathers the second and third terms in . If the assumptions of Proposition 2 hold, we obtain an error correction representation with error terms
However, we also have
Under suitable assumptions on the coefficients and , assuming in particular that the matrix is nonsingular, the matrix is zeroless and therefore has a finite-degree left inverse. Proceeding as in Proposition 2, we obtain an alternative error correction representation with just one error term, namely .
This example should be sufficient to convey the idea that admits error correction representations with a minimum of d and a maximum of c error terms.
The problem of error correction representations with different numbers of error terms has recently been addressed in Deistler and Wagner (2017). An implication of their main result (see Theorem 1, p. 41) is that if has the error correction representation and (the number of error terms is not the maximum), then and are not left coprime.
The consequences of Deistler and Wagner’s paper have not yet been developed. In Propositions 2 and 3 we have only considered representations with c error terms. On non-uniqueness of autoregressive representations for singular vectors with rational spectral density see also Chen et al. (2011); Anderson et al. (2012); Forni et al. (2015).
Appendix B.2. Uniqueness of Impulse-Response Functions
Suppose that the assumptions of Proposition 2, weak form, hold. Let be a solution of Equation (10), so that and suppose that has the autoregressive representation where is a rational matrix with poles outside the unit circle, , is a nonsingular q-dimensional white noise, is a full rank matrix. We have
The assumption that is full rank and the argument used e.g., in Brockwell and Davis (1991), p. 111, Problem 3.8, imply that is fundamental for . Thus , where is a nonsingular matrix (see Rozanov (1967), p. 57), and .
Appendix C. Data Generating Process for the Simulations
The simulation results of Section 3.4 are obtained using the following specification of (14): where , , , the degree of is 2, so that the degree of is 1. is generated using the factorization where and are matrix polynomials with all their roots outside the unit circle, and
(see Watson 1994). To get a VAR(2) we set , and , and then, by rewriting , we get , and .
Regarding the generation of the data, the diagonal entries of the matrix are drawn from a uniform distribution between and , while the off-diagonal entries are drawn from a uniform distribution between 0 and . is then multiplied by a scalar so that its largest eigenvalue is . The matrix is generated as in Bai and Ng (2007): (1) is an diagonal matrix of rank q where is drawn from the uniform distribution between and , (2) is obtained by orthogonalizing an uniform random matrix, (3) is equal to the first q columns of the matrix . Lastly, the orthogonal matrix is such that the upper submatrix of is lower triangular. The results are based on 1000 replications. The matrices , and are generated only once (the numerical values are available on request) so that the set of impulse responses to be estimated is the same for all replications, whereas the vector is redrawn from at each replication.
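Two of the steps described above, rescaling a randomly drawn matrix to a target largest eigenvalue and orthogonalizing a uniform random matrix, can be sketched as follows. The numerical ranges, the dimension and the target value are placeholders chosen for illustration, since the values used in the paper are not reproduced here. "Largest eigenvalue" is interpreted as largest modulus, and QR is one possible orthogonalization; both choices are assumptions, as are the function names.

```python
import numpy as np

def draw_scaled_matrix(rng, dim, diag_range, offdiag_max, target_radius):
    """Uniform diagonal and off-diagonal entries, rescaled so that the
    largest eigenvalue modulus equals target_radius (placeholder values)."""
    a = rng.uniform(0.0, offdiag_max, size=(dim, dim))
    a[np.diag_indices(dim)] = rng.uniform(*diag_range, size=dim)
    radius = np.max(np.abs(np.linalg.eigvals(a)))
    return a * (target_radius / radius)

def orthogonalize(rng, dim):
    """Orthogonal matrix obtained from the QR factorization of a uniform random matrix."""
    q_mat, _ = np.linalg.qr(rng.uniform(size=(dim, dim)))
    return q_mat

rng = np.random.default_rng(0)
a_mat = draw_scaled_matrix(rng, dim=4, diag_range=(0.5, 0.8),
                           offdiag_max=0.3, target_radius=0.9)
h_mat = orthogonalize(rng, dim=4)

print(np.max(np.abs(np.linalg.eigvals(a_mat))))
print(np.allclose(h_mat.T @ h_mat, np.eye(4)))
```

Fixing the random seed plays the role of generating the matrices only once, so that the same set of impulse responses is targeted in every replication.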
References
- Amengual, Dante, and Mark W. Watson. 2007. Consistent estimation of the number of dynamic factors in a large N and T panel. Journal of Business and Economic Statistics 25: 91–96.
- Anderson, Brian D. O., and Manfred Deistler. 2008a. Generalized linear dynamic factor models–A structure theory. Paper presented at IEEE Conference on Decision and Control, Cancun, Mexico, December 9–11.
- Anderson, Brian D. O., and Manfred Deistler. 2008b. Properties of zero-free transfer function matrices. SICE Journal of Control, Measurement and System Integration 1: 284–92.
- Anderson, Brian D. O., Manfred Deistler, Weitian Chen, and Alexander Filler. 2012. Autoregressive models of singular spectral matrices. Automatica 48: 2843–49.
- Bai, Jushan, and Serena Ng. 2007. Determining the number of primitive shocks in factor models. Journal of Business and Economic Statistics 25: 52–60.
- Banerjee, Anindya, Massimiliano Marcellino, and Igor Masten. 2014. Forecasting with factor-augmented error correction models. International Journal of Forecasting 30: 589–612.
- Banerjee, Anindya, Massimiliano Marcellino, and Igor Masten. 2017. Structural FECM: Cointegration in large–scale structural FAVAR models. Journal of Applied Econometrics 32: 1069–86.
- Barigozzi, Matteo, Antonio M. Conti, and Matteo Luciani. 2014. Do euro area countries respond asymmetrically to the common monetary policy? Oxford Bulletin of Economics and Statistics 76: 693–714.
- Barigozzi, Matteo, Marco Lippi, and Matteo Luciani. 2019. Large-dimensional dynamic factor models: Estimation of impulse-response functions with I(1) cointegrated factors. arXiv.
- Bauer, Dietmar, and Martin Wagner. 2012. A State Space Canonical Form For Unit Root Processes. Econometric Theory 28: 1313–49.
- Brockwell, Peter J., and Richard A. Davis. 1991. Time Series: Theory and Methods, 2nd ed. New York: Springer.
- Canova, Fabio. 2007. Methods for Applied Macroeconomics. Princeton: Princeton University Press.
- Chen, Weitian, Brian D. O. Anderson, Manfred Deistler, and Alexander Filler. 2011. Solutions of Yule-Walker equations for singular AR processes. Journal of Time Series Analysis 32: 531–38.
- Deistler, Manfred, Brian D. O. Anderson, A. Filler, Ch. Zinner, and W. Chen. 2010. Generalized linear dynamic factor models: An approach via singular autoregressions. European Journal of Control 16: 211–24.
- Deistler, Manfred, and Martin Wagner. 2017. Cointegration in singular ARMA models. Economics Letters 155: 39–42.
- Forni, Mario, and Luca Gambetti. 2010. The dynamic effects of monetary policy: A structural factor model approach. Journal of Monetary Economics 57: 203–16.
- Forni, Mario, Domenico Giannone, Marco Lippi, and Lucrezia Reichlin. 2009. Opening the Black Box: Structural Factor Models versus Structural VARs. Econometric Theory 25: 1319–47.
- Forni, Mario, Marc Hallin, Marco Lippi, and Lucrezia Reichlin. 2000. The Generalized Dynamic Factor Model: Identification and Estimation. The Review of Economics and Statistics 82: 540–54.
- Forni, Mario, Marc Hallin, Marco Lippi, and Paolo Zaffaroni. 2015. Dynamic factor models with infinite-dimensional factor spaces: One-sided representations. Journal of Econometrics 185: 359–71.
- Forni, Mario, and Marco Lippi. 2001. The Generalized Dynamic Factor Model: Representation Theory. Econometric Theory 17: 1113–41.
- Franchi, Massimo, and Paolo Paruolo. 2019. A general inversion theorem for cointegration. Econometric Reviews 38: 1176–201.
- Franklin, J. N. 2000. Matrix Theory, 2nd ed. New York: Dover Publications.
- Giannone, Domenico, Lucrezia Reichlin, and Luca Sala. 2005. Monetary policy in real time. In NBER Macroeconomics Annual 2004. Edited by Mark Gertler and Kenneth Rogoff. Cambridge: MIT Press, chp. 3. pp. 161–224.
- Gregoir, Stéphane. 1999. Multivariate Time Series With Various Hidden Unit Roots, Part I. Econometric Theory 15: 435–68.
- Johansen, Søren. 1988. Statistical analysis of cointegration vectors. Journal of Economic Dynamics and Control 12: 231–54.
- Johansen, Søren. 1991. Estimation and hypothesis testing of cointegration vectors in Gaussian vector autoregressive models. Econometrica 59: 1551–80.
- Johansen, Søren. 1995. Likelihood-Based Inference in Cointegrated Vector Autoregressive Models, 1st ed. Oxford: Oxford University Press.
- Lancaster, Peter, and Miron Tismenetsky. 1985. The Theory of Matrices, 2nd ed. New York: Academic Press.
- Luciani, Matteo. 2015. Monetary policy and the housing market: A structural factor analysis. Journal of Applied Econometrics 30: 199–218.
- Phillips, Peter C. B. 1998. Impulse response and forecast error variance asymptotics in nonstationary VARs. Journal of Econometrics 83: 21–56.
- Rozanov, Yu. A. 1967. Stationary Random Processes. San Francisco: Holden-Day.
- Sargent, Thomas J. 1989. Two Models of Measurements and the Investment Accelerator. Journal of Political Economy 97: 251–87.
- Sims, Christopher, James H. Stock, and Mark W. Watson. 1990. Inference in linear time series models with some unit roots. Econometrica 58: 113–44.
- Stock, James H., and Mark W. Watson. 1988. Testing for common trends. Journal of the American Statistical Association 83: 1097–107.
- Stock, James H., and Mark W. Watson. 2002a. Forecasting using principal components from a large number of predictors. Journal of the American Statistical Association 97: 1167–79.
- Stock, James H., and Mark W. Watson. 2002b. Macroeconomic forecasting using diffusion indexes. Journal of Business and Economic Statistics 20: 147–62.
- Stock, James H., and Mark W. Watson. 2016. Dynamic factor models, factor-augmented vector autoregressions, and structural vector autoregressions in macroeconomics. In Handbook of Macroeconomics. Edited by John B. Taylor and Harald Uhlig. Amsterdam: North Holland, Elsevier, vol. 2A, chp. 8. pp. 415–525.
- Van der Waerden, Bartel Leendert. 1953. Modern Algebra, 2nd ed. New York: Frederick Ungar, vol. I.
- Watson, Mark W. 1994. Vector autoregressions and cointegration. In Handbook of Econometrics. Edited by Robert F. Engle and Daniel L. McFadden. Amsterdam: North Holland, Elsevier, vol. 4, chp. 47. pp. 2843–915.
Usually orthonormality is assumed. This is convenient but not necessary in the present paper.
To our knowledge, the present paper is the first to study cointegration and error correction representations for singular vectors, the factors of dynamic factor models in particular. An error correction model in the DFM framework is studied in (Banerjee et al. 2014, 2017). However, their focus is on the relationship between the observable variables and the factors. Their error correction term is a linear combination of the variables and the factors , which is stationary if the idiosyncratic components are stationary (so that the x’s and the factors are cointegrated). Because of this and other differences their results are not directly comparable to those in the present paper.
In the square case, r = q, Assumption 3 holds if and only if M(z) is unimodular.
If is a zero of , multiply by an invertible matrix such that is a zero of, say, the first row of . Then multiply by the diagonal matrix with in position and unity elsewhere on the main diagonal. Iterating, all the zeros of are removed.
Table 1. Monte Carlo Simulations. VECM: , q = 3, .
Root mean squared errors at different lags, when estimating the impulse-response functions of the simulated variables to the shocks . Estimation is carried out using three different autoregressive representations: a VAR for (DVAR), a VAR for (LVAR), and a VECM with error terms (VECM). The results are based on 1000 replications. For the data generating process see Appendix C. The RMSEs are obtained averaging over all replications and all responses.
© 2020 by Matteo Barigozzi and Marco Lippi. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).