1. Introduction
In a conceptual exploration of long-run causal order, Hoover (2018) applies the CVAR(1) model for the processes $y_t$ and $T_t$ to model a causal graph. The process $(y_t, T_t)$ is a solution to the equations

  $y_t = M y_{t-1} + C T_{t-1} + \varepsilon_{yt}, \qquad T_t = T_{t-1} + \varepsilon_{Tt}, \qquad t = 1, 2, \ldots,$  (1)

where the error terms $\varepsilon_{yt}$ are independent identically distributed (i.i.d.) Gaussian variables with mean 0 and variance $\Omega_y$ and are independent of the errors $\varepsilon_{Tt}$, which are i.i.d. Gaussian with mean 0 and variance $\Omega_T$.
Thus, the stochastic trends $T_t$ are nonstationary random walks, and conditions will be given below for $y_t$ to be $I(1)$, that is, nonstationary, but with $\Delta y_t$ stationary. This will imply that $y_t - \Xi T_t$ is stationary, where $\Xi = (I - M)^{-1} C$, so that $y_t$ and $T_t$ cointegrate.
The entry $M_{ij} \neq 0$ means that $x_j$ causes $x_i$, which is written $x_j \Rightarrow x_i$, and $C_{ij} \neq 0$ means that $T_j \Rightarrow x_i$, where $x_1, \ldots, x_n$ denote the components of $y_t$. Note that the model assumes that there are no causal links from $y_t$ to $T_t$, so that $T_t$ is strongly exogenous.
A simple example for three variables, $x_1$, $x_2$, $x_3$, and a trend $T$, is the graph

  $T \Rightarrow x_1 \Rightarrow x_2 \Rightarrow x_3,$

where the matrices are given by

  $M = \begin{pmatrix} * & 0 & 0 \\ * & * & 0 \\ 0 & * & * \end{pmatrix}, \qquad C = \begin{pmatrix} * \\ 0 \\ 0 \end{pmatrix},$

where $*$ indicates a nonzero coefficient.
Provided that $M$ has all eigenvalues in the open unit disk, it is seen that

  $s_t = \sum_{i=0}^{\infty} M^i (\varepsilon_{y,t-i} - \Xi \varepsilon_{T,t-i}), \qquad \Xi = (I - M)^{-1} C,$

determines a stationary process defined for all $t$. We define a nonstationary solution to (1) for $t \geq 1$ by

  $T_t = T_0 + \sum_{i=1}^{t} \varepsilon_{Ti}, \qquad y_t = \Xi T_t + s_t.$  (2)

Note that the starting values are $(y_0, T_0) = (\Xi T_0 + s_0, T_0)$. It is seen that $\Delta y_t$, $\Delta T_t$, and $y_t - \Xi T_t$ are stationary processes for all $t$, and that $(y_t, T_t)$ is a solution to Equation (1). In the following, we assume that $(y_t, T_t)$ is defined by (2) for $t \geq 1$.
The paper by Hoover gives a detailed and general discussion of the problems of recovering causal structures from nonstationary observations on subsets of the variables when $T_t$ is unobserved. That is, the process is partitioned as $y_t = (y_{1t}', y_{2t}')'$, where the observations $y_{1t}$ are $n_1$-dimensional and the unobserved processes $y_{2t}$ and $T_t$ are $n_2$- and $m$-dimensional, respectively, $n = n_1 + n_2$. It is assumed that there are at least as many observations as trends, that is, $n_1 \geq m$.
Model (1) is therefore rewritten as

  $y_{1t} = M_{11} y_{1,t-1} + M_{12} y_{2,t-1} + C_1 T_{t-1} + \varepsilon_{1t},$
  $y_{2t} = M_{21} y_{1,t-1} + M_{22} y_{2,t-1} + C_2 T_{t-1} + \varepsilon_{2t},$  (3)
  $T_t = T_{t-1} + \varepsilon_{Tt}.$

Note that there is now a causal link from the observed process $y_{1t}$ to the unobserved process $y_{2t}$ if $M_{21} \neq 0$.
It follows from (3) that $y_{1t}$ is $I(1)$ and cointegrated with $n_1 - m$ cointegrating vectors $\beta$, see Theorem 1. Therefore, $y_{1t}$ has an infinite order autoregressive representation, see (Johansen and Juselius 2014, Lemma 2), which is written as

  $\Delta y_{1t} = \alpha \beta' y_{1,t-1} + \sum_{i=1}^{\infty} \Gamma_i \Delta y_{1,t-i} + u_t,$  (4)

where the operator norm $\|\Gamma_i\|$ is $O(\rho^i)$ for some $0 < \rho < 1$. The matrices $\alpha$ and $\beta$ are $n_1 \times (n_1 - m)$ of rank $n_1 - m$, and $u_t = y_{1t} - E(y_{1t} \mid \mathcal{F}_{t-1})$, where $\mathcal{F}_t = \sigma(y_{11}, \ldots, y_{1t})$. Thus, $\varepsilon_{1t}$ is not measurable with respect to $\mathcal{F}_t$, but $u_t$ is measurable with respect to $\mathcal{F}_t$. Here, the prediction errors $u_t$ are i.i.d. $N_{n_1}(0, \Sigma_u)$, where $\Sigma_u$ is calculated below. The representation of $y_{1t}$, similar to (2), is

  $y_{1t} = \mathcal{C} \sum_{i=1}^{t} u_i + \tau_t,$  (5)

where $\mathcal{C} = \beta_\perp(\alpha_\perp' \Gamma \beta_\perp)^{-1} \alpha_\perp'$ with $\Gamma = I_{n_1} - \sum_{i=1}^{\infty} \Gamma_i$, and $\tau_t$ is a stationary process. Here, $\beta_\perp$ is an $n_1 \times m$ matrix of full rank for which $\beta' \beta_\perp = 0$, and similarly for $\alpha_\perp$. This shows that $y_{1t}$ is a cointegrated $I(1)$ process, that is, $y_{1t}$ is nonstationary, while $\Delta y_{1t}$ and $\beta' y_{1t}$ are stationary.
A statistical analysis, including estimation of $\alpha$, $\beta$, and the $\Gamma_i$, can be conducted for the observations $y_{1t}$ using an approximating finite order CVAR, see Saikkonen (1992) and Saikkonen and Lütkepohl (1996). Hoover (2018) investigates, in particular, whether weak exogeneity for some of the observed variables in the approximating finite order CVAR, that is, a zero row in $\alpha$, is a useful tool for finding the causal structure in the graph.
The present note solves the problem of finding expressions for the parameters $\alpha$ and $\beta$ in the CVAR($\infty$) model (4) for the observations $y_{1t}$, as functions of the parameters in model (3), and finds conditions on these for the presence of a zero row in $\alpha$, and hence weak exogeneity for the corresponding variable in the approximating finite order CVAR.
2. The Assumptions and Main Results
First, some definitions and assumptions are given, then the main results on $\beta$ and $\alpha$ are presented and proved in Theorems 1 and 2. These results rely on Theorem A1 on the solution of an algebraic Riccati equation, which is given and proved in Appendix A.
In the following, a matrix is called stable if all its eigenvalues are contained in the open unit disk. If $A$ is a $p \times r$ matrix of rank $r < p$, an orthogonal complement $A_\perp$ is defined as a $p \times (p - r)$ matrix of rank $p - r$ for which $A' A_\perp = 0$. If $r = p$, $A_\perp = 0$. Note that $A_\perp$ is only defined up to multiplication from the right by a matrix of full rank. Throughout, $E_t$ and $\mathrm{Var}_t$ denote conditional expectation and variance given the sigma-field $\mathcal{F}_t$, generated by the observations.
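For concreteness, an orthogonal complement can be computed numerically via the singular value decomposition. The small helper below is our illustration (the paper contains no code); it returns an $A_\perp$ with $A' A_\perp = 0$.

```python
import numpy as np

def orth_complement(A, tol=1e-10):
    """Return a p x (p - r) orthogonal complement A_perp of a p x r
    matrix A of rank r, i.e. a full rank matrix with A' A_perp = 0.
    A_perp is only unique up to right multiplication by a full rank matrix."""
    A = np.atleast_2d(A)
    U, s, _ = np.linalg.svd(A, full_matrices=True)
    rank = int(np.sum(s > tol))
    # Columns of U beyond rank(A) span the orthocomplement of col(A).
    return U[:, rank:]

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 2.0]])        # 3 x 2 matrix of rank 2
A_perp = orth_complement(A)        # 3 x 1
print(np.abs(A.T @ A_perp).max())  # numerically zero
```

Using the SVD rather than, say, a QR factorization makes the rank determination explicit via the singular values.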
Assumption 1. In Equation (3), it is assumed that (i) $\varepsilon_{1t}$, $\varepsilon_{2t}$, and $\varepsilon_{Tt}$ are mutually independent and i.i.d. Gaussian with mean zero and variances $\Omega_1$, $\Omega_2$, and $\Omega_T$, where $\Omega_1$ and $\Omega_2$ are diagonal matrices,
(ii) $M$, $M_{11}$, and $M_{22}$ are stable,
(iii) $C_1 + M_{12}(I_{n_2} - M_{22})^{-1} C_2$ has full rank $m$.
Let $(y_{1t}, y_{2t}, T_t)$, $t \geq 1$, be the solution to (3) given in (2), such that $\Delta y_t$ and $y_t - \Xi T_t$ are stationary. Assumption 1(ii) on $M_{22}$ and $M$ is taken from Hoover (2018) to ensure that, for instance, the process $y_{2t}$ given by the equations in (3) is stationary if the input $(y_{1t}, T_t)$ is stationary, such that the nonstationarity of $y_{2t}$ in model (3) is created by the trends $T_t$ and not by the own dynamics of $y_{2t}$ as given by $M_{22}$. It follows from this assumption that $I_n - M$ is nonsingular, because $M$ is stable, and similarly for $I_{n_1} - M_{11}$ and $I_{n_2} - M_{22}$. Moreover, $I_{n_1} - M_{11} - M_{12}(I_{n_2} - M_{22})^{-1} M_{21}$ is nonsingular, because it is the Schur complement of $I_{n_2} - M_{22}$ in the nonsingular matrix $I_n - M$.
The Main Results
The first result, on $\beta$, is a simple consequence of model (3).

Theorem 1. Assumption 1 implies that the cointegrating rank is $r = n_1 - m$ and that the coefficients $\beta$ and $\beta_\perp$ in the CVAR($\infty$) representation for $y_{1t}$, see (4), are given for $n_1 > m$ as

  $\beta = \Xi_{1\perp}, \qquad \beta_\perp = \Xi_1, \qquad \Xi_1 = (I_{n_1} - M_{11} - M_{12}(I_{n_2} - M_{22})^{-1} M_{21})^{-1}(C_1 + M_{12}(I_{n_2} - M_{22})^{-1} C_2).$  (6)

For $n_1 = m$, $\Xi_1$ has rank $n_1$ and there is no cointegration: $r = 0$.
Proof of Theorem 1. From the model Equation (3), it follows, by eliminating $y_{2t}$ from the first two equations, that

  $(I_{n_1} - M_{11}L - M_{12}(I_{n_2} - M_{22}L)^{-1} M_{21}L^2) y_{1t} = (C_1 + M_{12}(I_{n_2} - M_{22}L)^{-1} C_2 L) T_{t-1} + \varepsilon_{1t} + M_{12}(I_{n_2} - M_{22}L)^{-1} \varepsilon_{2,t-1},$  (7)

where $L$ denotes the lag operator. Solving for the nonstationary terms gives

  $y_{1t} = \Xi_1 T_{t-1} + (\text{stationary terms}).$

Multiplying by $\beta' = \Xi_{1\perp}'$, it is seen that $\beta' y_{1t}$ is stationary, if $\beta' \Xi_1 = 0$. By Assumption 1(iii), $\Xi_1$ has rank $m$, so that $\beta = \Xi_{1\perp}$ has rank $n_1 - m$, which proves (6). ☐
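As a numerical sanity check on Theorem 1, the sketch below builds $\Xi_1 = (I - M_{11} - M_{12}(I - M_{22})^{-1}M_{21})^{-1}(C_1 + M_{12}(I - M_{22})^{-1}C_2)$ for hypothetical block matrices of our own choosing (with $n_1 = 2$, $n_2 = 1$, $m = 1$) and verifies that $\beta = \Xi_{1\perp}$ annihilates $\Xi_1$.

```python
import numpy as np

# Hypothetical parameter blocks: n1 = 2 observed, n2 = 1 unobserved, m = 1 trend.
M11 = np.array([[0.5, 0.0], [0.3, 0.4]])
M12 = np.array([[0.0], [0.2]])
M21 = np.array([[0.1, 0.0]])
M22 = np.array([[0.6]])
C1 = np.array([[1.0], [0.0]])
C2 = np.array([[0.0]])

# Xi1: the loading of the observed variables on the common trend.
S = np.eye(2) - M11 - M12 @ np.linalg.solve(np.eye(1) - M22, M21)
K = C1 + M12 @ np.linalg.solve(np.eye(1) - M22, C2)
Xi1 = np.linalg.solve(S, K)

# beta = Xi1_perp: the n1 - m = 1 cointegrating vector, via the SVD.
U, s, _ = np.linalg.svd(Xi1, full_matrices=True)
beta = U[:, 1:]

print(np.abs(beta.T @ Xi1).max())  # numerically zero: beta' y_1t is stationary
```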
The result for $\alpha$ is more involved and is given in Theorem 2. The proof is a further analysis of (7) and involves, first, the representation of $y_{1t}$ in terms of a sum of prediction errors $u_i$, see (5), and, second, a representation of $E_t(T_t)$ as a (weighted) sum of the prediction errors $u_i$. The second representation requires a result from control theory on the solution of an algebraic Riccati equation, together with some results based on the Kalman filter for the calculation of the conditional mean and variance of the unobserved processes $(y_{2t}, T_t)$ given the observations $y_{1s}$, $s \leq t$. These are collected as Theorem A1 in Appendix A.
For the discussion of these results, it is useful to reformulate (3) by defining the unobserved variables and errors

  $z_t = (y_{2t}', T_t')', \qquad \eta_t = (\varepsilon_{2t}', \varepsilon_{Tt}')',$  (8)

and the matrices

  $A = \begin{pmatrix} M_{22} & C_2 \\ 0 & I_m \end{pmatrix}, \qquad B = (M_{12}, C_1), \qquad G = (M_{21}', 0)', \qquad \Omega_\eta = \mathrm{blockdiag}(\Omega_2, \Omega_T),$  (9)

so that $z_t = G y_{1,t-1} + A z_{t-1} + \eta_t$ and $y_{1t} = M_{11} y_{1,t-1} + B z_{t-1} + \varepsilon_{1t}$. One can then show, see Theorem A1, that based on properties of the Gaussian distribution, a recursion can be found for the calculation of $E_t(z_t)$, $V_t = \mathrm{Var}_t(z_t)$, and the prediction errors $u_t$, using the matrices in (8) and (9), by the Equations (10) and (11). It then follows from results from control theory, that $V = \lim_{t\to\infty} V_t$ exists and satisfies the algebraic Riccati equation

  $V = A(V - V B'(B V B' + \Omega_1)^{-1} B V)A' + \Omega_\eta.$  (12)

Moreover, the prediction errors $u_t = y_{1t} - E_{t-1}(y_{1t})$ are independent $N_{n_1}(0, B V_{t-1} B' + \Omega_1)$ for finite $t$, and in the limit the prediction errors are independent identically distributed $N_{n_1}(0, \Sigma_u)$ for $\Sigma_u = B V B' + \Omega_1$. Finally, $E_t(z_t)$ has the representation in the prediction errors given in (14), where the weights involve the steady-state Kalman gain $K = A V B' \Sigma_u^{-1}$.
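The limit $V$ and the Riccati equation can be illustrated by iterating the filtering recursion to a fixed point. The state-space form and all parameter values below are our hypothetical reconstruction (state $z_t = (y_{2t}', T_t')'$, observation loading $B = (M_{12}, C_1)$), not formulas quoted from the paper.

```python
import numpy as np

# Hypothetical blocks: n1 = 2 observed variables, n2 = 1 unobserved, m = 1 trend.
M22 = np.array([[0.6]])
C2 = np.array([[0.0]])
M12 = np.array([[0.0], [0.2]])
C1 = np.array([[1.0], [0.0]])
Omega1, Omega2, OmegaT = np.eye(2), np.eye(1), np.eye(1)

A = np.block([[M22, C2],
              [np.zeros((1, 1)), np.eye(1)]])   # state transition (unit root in T)
B = np.hstack([M12, C1])                        # observation loading on z_{t-1}
Q = np.block([[Omega2, np.zeros((1, 1))],
              [np.zeros((1, 1)), OmegaT]])      # state error variance

# Iterate V <- A (V - V B'(B V B' + Omega1)^{-1} B V) A' + Q to a fixed point.
V = np.eye(2)
for _ in range(1000):
    S = B @ V @ B.T + Omega1
    V_new = A @ (V - V @ B.T @ np.linalg.solve(S, B @ V)) @ A.T + Q
    if np.max(np.abs(V_new - V)) < 1e-13:
        break
    V = V_new
V = V_new

# The fixed point satisfies the algebraic Riccati equation.
S = B @ V @ B.T + Omega1                         # innovation variance Sigma_u
resid = V - (A @ (V - V @ B.T @ np.linalg.solve(S, B @ V)) @ A.T + Q)
print(np.max(np.abs(resid)))
```

Convergence of such iterations is guaranteed under standard detectability conditions; here the trend is observable through $C_1 \neq 0$.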
Comparing the representation (5) for $y_{1t}$ and (14) for $E_t(z_t)$ gives a more precise relation between the coefficients of the nonstationary terms in (7). The main result of the paper is to show how this leads to expressions for the coefficients $\alpha$ and $\alpha_\perp$ as functions of the parameters in model (3).
Theorem 2. Assumption 1 implies that the coefficients $\alpha$ and $\alpha_\perp$ in the CVAR($\infty$) representation of $y_{1t}$ are given for $n_1 \geq m$ as in (15), where the matrix $V$ in (16) is the solution of the algebraic Riccati Equation (12).

Proof of Theorem 2. The left hand side of (7) has two nonstationary terms. The observation $y_{1t}$ is represented in (5) in terms of a random walk in the prediction errors $u_i$ plus a stationary term, and $T_t$ is a random walk in $\varepsilon_{Ti}$. Calculating the conditional expectation given the sigma-field $\mathcal{F}_t$, $T_t$ is replaced by $E_t(T_t)$, which in (14) is represented as a weighted sum of the $u_i$. Thus, the conditional expectation of (7) gives (17), where the right hand side is bounded in mean.
Setting $t = [Nv]$, $0 \leq v \leq 1$, and dividing by $N^{1/2}$, it follows from (5) that the normalized sum of prediction errors converges to $W(v)$, where $W$ is the Brownian motion generated by the i.i.d. prediction errors $u_t$. From (14), it can be proved that the normalized weighted sum of prediction errors has the same limit, see (19). This follows by replacing $V_t$ by its limit $V$, because for $t \to \infty$ it holds that $V_t \to V$. Next, we can replace $E_t(T_t)$ by $T_t$ as follows: For $s \leq t$, the sum $\sum_{i=1}^{s} u_i$ is measurable with respect to both $\mathcal{F}_s$ and $\mathcal{F}_t$, such that the conditional expectations given $\mathcal{F}_s$ and $\mathcal{F}_t$ coincide. Then the difference $E_t(T_t) - T_t$ is bounded in mean, and therefore the normalized difference vanishes in the limit, which proves (19).
Finally, setting $t = N$ and normalizing (17) by $N^{1/2}$, it follows that in the limit a relation between the coefficients of the Brownian motion terms holds. This relation shows that the coefficient to the remaining term is zero, so that $\alpha_\perp$ can be chosen as stated, and therefore $\alpha$ is determined, which proves (15). ☐
3. Two Examples of Simplifying Assumptions
It follows from Theorem 2 that in order to investigate a zero row in $\alpha$, the matrix $V$ is needed. This is easy to calculate from the recursion (11) for a given value of the parameters, but the properties of $V$ are more difficult to evaluate. In general, $\alpha$ does not contain a zero row, but if $V$ is block diagonal, the expressions for $\alpha$ and $V$ simplify, so that simple conditions on $M$ and $C$ imply a zero row in $\alpha$, and hence give weak exogeneity in the statistical analysis of the approximating finite order CVAR. This extra condition, $V_{12} = 0$, implies that the conditional variances of $y_{2t}$ and $T_t$ separate, such that the expression (15) for $\alpha$ simplifies to (20). Thus, a condition for a zero row in $\alpha$ is a corresponding zero row in the right hand side of (20). This is simple to check by inspecting the matrices $M$ and $C$ in model (3). In the following, two cases are given where such a simple solution is available.
Case 1 ($M_{12} = 0$). If the unobserved process $y_{2t}$ does not cause the observations $y_{1t}$, then $M_{12} = 0$. Therefore, $\Xi_1 = (I_{n_1} - M_{11})^{-1} C_1$, and from (20) it follows that the expression for $\alpha$ involves only $M_{11}$, $C_1$, and $V$. Thus, $\alpha$ has a zero row if the right hand side of (20) has a zero row.
An example of $M_{12} = 0$ is the chain $T \Rightarrow x_1 \Rightarrow x_2 \Rightarrow x_3$, where $(x_1, x_2)$ is observed and $x_3$ is not, and hence $M_{12} = 0$ and $C_2 = 0$. Then the first row of the right hand side of (20) vanishes, because $x_1$ is caused by the trend $T$ only. Thus, the first row of $\alpha$ is a zero row, such that $x_{1t}$ is weakly exogenous.
To formulate the next case, a definition of strong orthogonality of two matrices is introduced.
Definition 1. Let $A$ be a $p \times a$ matrix and $B$ a $p \times b$ matrix. Then, $A$ and $B$ are called strongly orthogonal if $A' D B = 0$ for all diagonal $p \times p$ matrices $D$, or equivalently, if for all $j = 1, \ldots, p$, either row $j$ of $A$ or row $j$ of $B$ is zero.
Thus, if row $j$ of $A$ is nonzero, row $j$ of $B$ is zero, and if row $j$ of $B$ is nonzero, row $j$ of $A$ is zero. A simple example is

  $A = \begin{pmatrix} * \\ 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 \\ * \end{pmatrix}.$

Thus, the definition means that if two matrices are strongly orthogonal, it is due to the positions of the zeros and not to linear combinations of nonzero numbers being zero.
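Strong orthogonality is a purely combinatorial condition on the rows, so it can be checked directly. The helper below is our illustration; it tests the row-support condition and, as a sanity check, the defining property $A'DB = 0$ for a random diagonal $D$.

```python
import numpy as np

def strongly_orthogonal(A, B, tol=1e-12):
    """A and B (same number of rows) are strongly orthogonal iff no row
    index j has both row j of A and row j of B nonzero; this is
    equivalent to A' D B = 0 for every diagonal matrix D."""
    A, B = np.atleast_2d(A), np.atleast_2d(B)
    rows_A = np.any(np.abs(A) > tol, axis=1)  # nonzero rows of A
    rows_B = np.any(np.abs(B) > tol, axis=1)  # nonzero rows of B
    return not np.any(rows_A & rows_B)        # no shared nonzero row

A = np.array([[1.0], [0.0]])
B = np.array([[0.0], [2.0]])
print(strongly_orthogonal(A, B))  # True: disjoint nonzero rows

# Sanity check of the equivalence for a random diagonal D.
D = np.diag(np.random.default_rng(1).standard_normal(2))
print(np.allclose(A.T @ D @ B, 0))  # True
```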
Thus, in particular, if $C_1$ and $M_{12}$ are strongly orthogonal, and if $T$ causes a variable in $y_{1t}$, then $y_{2t}$ does not cause that variable. The expression for $V$ simplifies in the following case.
Lemma 1. If $C_2 = 0$ and $C_1$ and $M_{12}$ are strongly orthogonal, then $V$ and $\Sigma_u$ are block diagonal, such that $V_{12} = 0$ and (21) holds.
Proof of Lemma 1. We first prove that $V_t = \mathrm{Var}_t(z_t)$ is block diagonal for all $t \geq 0$. From (2), it follows that the variance $V_0$ of $z_0$ given $\mathcal{F}_0$ is block diagonal. Assume, therefore, that $V_{t-1} = \mathrm{blockdiag}(V_{11,t-1}, V_{22,t-1})$, and consider the expression for $V_t$, see (11). In this expression, $A V_{t-1} A'$ is block diagonal (because $C_2 = 0$, and $A$ and $V_{t-1}$ are block diagonal), and the same holds for $\Omega_\eta$. Thus, it is enough to show that $A V_{t-1} B'(B V_{t-1} B' + \Omega_1)^{-1} B V_{t-1} A'$ is block diagonal. To simplify the notation, define the normalized matrices $a = M_{12} V_{11,t-1}^{1/2}$ and $b = C_1 V_{22,t-1}^{1/2}$, which have the same zero rows as $M_{12}$ and $C_1$, respectively. Then, by assumption, $a$ and $b$ are strongly orthogonal, so that, using that $\Omega_1$ is diagonal, a direct calculation shows that $B V_{t-1} B' + \Omega_1 = aa' + bb' + \Omega_1$ is block diagonal after ordering the rows by the supports of $a$ and $b$, and that $a'(aa' + bb' + \Omega_1)^{-1} b = 0$, such that $A V_{t-1} B'(B V_{t-1} B' + \Omega_1)^{-1} B V_{t-1} A'$ is block diagonal.
Then, $V_t$, and hence all $V_t$, $t \geq 0$, are block diagonal. Taking the limit for $t \to \infty$, it is seen that also $V$ is block diagonal. ☐
Case 2 ($C_2 = 0$, and $M_{12}$ and $C_1$ strongly orthogonal). Because $C_2 = 0$, and $C_1$ and $M_{12}$ are strongly orthogonal, Lemma 1 shows that $V$ is block diagonal, so that the condition $V_{12} = 0$ and (20) hold. Moreover, strong orthogonality also implies that $M_{12}' \Sigma_u^{-1} C_1 = 0$, such that $\Sigma_u^{-1} C_1$ has zero rows wherever $M_{12}$ has nonzero rows. Hence (21) holds, and therefore, a zero row in the right hand side of (21) gives a zero row in $\alpha$.
Consider again the chain $T \Rightarrow x_1 \Rightarrow x_2 \Rightarrow x_3$, but assume now that $x_2$ is not observed. Thus, $y_{1t} = (x_{1t}, x_{3t})'$ and $y_{2t} = x_{2t}$. Here, $T$ causes $x_1$ and $x_2$ causes $x_3$, so that $C_1 = (*, 0)'$, $M_{12} = (0, *)'$, and $C_2 = 0$. Note that $C_1' D M_{12} = 0$ for all diagonal $D$, because $T$ and $x_2$ cause disjoint subsets of $y_{1t}$. This, together with $C_2 = 0$, implies that $V$ is block diagonal and that (21) holds. Thus, $x_{1t}$ is weakly exogenous, $\alpha_{1\cdot} = 0$, if the first row of the right hand side of (21) is zero.
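The Case 2 pattern for the chain can be written out explicitly; the numerical values below are arbitrary placeholders for the $*$ entries, since only the zero pattern matters for the conditions.

```python
import numpy as np

# Chain T => x1 => x2 => x3 with x2 unobserved: observed vector (x1, x3).
C1 = np.array([[0.8], [0.0]])    # T causes x1 only
M12 = np.array([[0.0], [0.3]])   # x2 causes x3 only
C2 = np.array([[0.0]])           # T does not cause x2 directly

# Case 2 conditions: C2 = 0, and C1 and M12 strongly orthogonal
# (no common nonzero row, hence C1' D M12 = 0 for every diagonal D).
rows_C1 = np.any(C1 != 0, axis=1)
rows_M12 = np.any(M12 != 0, axis=1)
print(np.all(C2 == 0) and not np.any(rows_C1 & rows_M12))  # True
```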