Designing Bivariate Auto-Regressive Timeseries with Controlled Granger Causality

In this manuscript, we analyze a bivariate vector auto-regressive (VAR) model in order to derive design principles for a timeseries with a controlled statistical inter-relationship. We show how to generate bivariate timeseries with a given covariance and Granger causality (or, equivalently, transfer entropy), and show the trade-off relationship between these two types of statistical interaction. In principle, covariance and Granger causality are independently controllable, but the feasible ranges of their values which allow the VAR to be proper and have a stationary distribution constrain each other. Thus, our analysis identifies the essential tri-lemma structure among the stability and properness of the VAR, the controllability of covariance, and that of Granger causality.


Background and Motivation
In the field of cognitive psychology, the human perception of the life-likeness (called animacy perception) of one or multiple moving geometric patterns has been studied for decades [1][2][3][4][5]. There are multiple findings on the effect of "synchrony" or "temporal contingency" between multiple moving points on animacy perception. Findings from one line of research [2] have suggested that a higher degree of "temporal contingency" between the moving objects is related to a higher likelihood of animacy perception. Findings from another line of research [6] have suggested that the highest "temporal contingency", presented in the form of perfect synchronization, decreases the likelihood of animacy perception.
These two lines of research together suggest the existence of multiple types of "temporal contingency". Nevertheless, past research does not appear to have clarified these types. Further, confusion between these two distinct types has led to two lines of apparently conflicting effects of "temporal contingency".
With this potential conflict in the literature on animacy perception in mind, we explore a theoretical framework which can generate timeseries of multiple random variables with multiple distinct types of statistical dependency. One such system, which is sufficiently simple and readily manipulable, is vector auto-regression (VAR). Vector auto-regression is a random process for generating multivariate timeseries for a given set of parameters. In this manuscript, we specifically consider only bivariate VAR, which is a minimal system with interaction between two moving points.

Vector Auto-Regressive Model, Granger Causality, and Transfer Entropy
Importantly, bivariate VAR, a series of paired random variables $(x_t, y_t)$ for $t = 0, 1, \ldots$, has two types of statistical dependency: the correlation and the Granger causality of a timeseries [7], the latter of which has been identified with the transfer entropy [8] of a timeseries generated by a Gaussian process [9]. The correlation between the univariate series $x$ and $y$ is the statistical dependency between $x_t$ and $y_t$ in the limit $t \to \infty$ (if it exists), while the Granger causality from $y$ to $x$ is that between $x_t$ and $y_{t-1}$ given $x_{t-1}$ in the limit $t \to \infty$ (if it exists).
Conceptually, correlation captures a type of similarity between two timeseries, whereas Granger causality captures the "reactiveness" of one timeseries to another.
Thus, given these differences in theory, our goal is to propose a theoretical method to generate a bivariate random timeseries with a desired correlation and two types of Granger causality. The Granger causality of VAR has also been considered in other fields. In econometrics, the field in which it was originally proposed [7], Granger causality has been used as a measure of interaction between a pair of economic timeseries [10][11][12]. It has also been used in the behavioral sciences generally [13,14], and particularly in computational neuroscience [15][16][17][18]. Such an in-principle data-generation technique would be vital for testing any hypothesis on the empirical nature of timeseries (e.g., animacy perception) in the empirical sciences using VAR and Granger causality, as mentioned above. To our knowledge, however, there has been no mathematical analysis of the theoretical limitations of such a data-generation technique for given statistics.
Thus, we first need to explore the mathematical relationship among the parameters in a VAR with the correlation and Granger causality in a timeseries generated from it. In this paper, we therefore explore a theoretical structure of bivariate VAR from the designer's perspective, and analyze a mathematical limit to the extent which we can simultaneously control correlation and Granger causality of a bivariate timeseries.
This paper is structured as follows. In Section 2, the VAR model is defined, and a set of basic statistical properties of the VAR model is derived from it, such as Granger causality (Section 3) in terms of a set of parameters. In Section 4, the existence of the stationary distribution of the VAR is analyzed; this is the foundation which sets the limits of the controllable set of parameters. In Section 5, we give a method to derive the parameters of a VAR for a given set of statistics of a bivariate timeseries. In Section 6, the mathematical analysis provided in this paper is summarized, and a remark on the design principle of bivariate timeseries generated by VAR is added.

Vector Auto-Regression (VAR)
In theory, Granger causality (GC) is the transfer entropy of random variables in a bivariate vector auto-regression (VAR) model up to a constant factor of 2, if the VAR model has a stationary distribution [9]. Thus, it is straightforward to start with the bivariate VAR and derive its transfer entropy. In this way, we can derive a rich mathematical relationship between GC and the properties of VAR, rather than just a statistic of the bivariate timeseries.

Definition 1. For some real vector $\mu \in \mathbb{R}^2$ and some positive-definite matrix $\Sigma \in \mathbb{R}^{2\times2}$, suppose the random variable $\epsilon_t$ for every integer $t = 0, 1, \ldots$ is drawn from the bivariate normal distribution $N(\epsilon_t \mid \mu, \Sigma)$ with mean $\mu$ and covariance $\Sigma$. Define the initial vector by $v_0 = (x_0, y_0)^\top$, and, for $t \ge 0$ and a given coefficient matrix with real entries
$$A := \begin{pmatrix} a_{0,0} & a_{0,1} \\ a_{1,0} & a_{1,1} \end{pmatrix} \in \mathbb{R}^{2\times2},$$
define the random variable
$$v_{t+1} := A v_t + \epsilon_t. \tag{1}$$
Then, bivariate vector auto-regression is defined by the semi-infinite series of random variables $v_0, v_1, \ldots$.

In general, one can generate a timeseries $v_0, v_1, \ldots$ by fixing the VAR parameters: the coefficient matrix $A$ and the base covariance matrix $\Sigma$, where the base mean vector $\mu$ is omitted, as its effect is lost in the limit $t \to \infty$ when the VAR is stationary. The stationary correlation (covariance) $\bar\Sigma$ and the Granger causalities $G_0$ and $G_1$ defined later are the statistics of the timeseries generated by a VAR model (Figure 1). In what follows, we first explore the forward relationship: how the statistics $\bar\Sigma$ and $G_0, G_1$ are determined by the VAR parameters $(A, \Sigma)$. We then consider the backward relationship: which VAR parameters $(A, \Sigma)$ suffice to generate a timeseries with given desired timeseries statistics $(G_0, G_1, \bar\Sigma)$.
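As a concrete illustration, the generative process of Definition 1 can be sketched in a few lines of code. The particular values of $A$ and $\Sigma$ below are illustrative choices, not taken from the manuscript.

```python
import numpy as np

# Sketch of Definition 1: generate a bivariate VAR(1) timeseries
# v_{t+1} = A v_t + eps_t with eps_t ~ N(mu, Sigma).
# The parameter values are illustrative (hypothetical) choices.
rng = np.random.default_rng(0)

A = np.array([[0.5, 0.2],
              [0.1, 0.4]])        # coefficient matrix (chosen stable)
Sigma = np.array([[1.0, 0.3],
                  [0.3, 1.0]])    # base covariance (positive definite)
mu = np.zeros(2)                  # base mean; its effect vanishes as t grows

T = 10_000
v = np.zeros((T + 1, 2))          # v[0] = (x_0, y_0) is the initial vector
for t in range(T):
    eps = rng.multivariate_normal(mu, Sigma)   # eps_t
    v[t + 1] = A @ v[t] + eps                  # Equation (1)

print(v.shape)
```

Because the chosen $A$ is stable, the generated series fluctuates around zero rather than diverging.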

Marginal Distribution of the VAR at Each Step

Lemma 1 (Marginal distribution of the VAR random variables at each step). The VAR model with the initial vector $v_0 \in \mathbb{R}^2$ and the coefficient matrix $A \in \mathbb{R}^{2\times2}$ has the bivariate normal distribution $N(v_t \mid \mu_t, \Sigma_t)$ as the marginal distribution of the random variable $v_t$ at each step $t = 0, 1, \ldots$, where
$$\mu_{t+1} = A\mu_t + \mu, \qquad \Sigma_{t+1} = A\Sigma_t A^\top + \Sigma,$$
with $\mu_0 = v_0$ and $\Sigma_0 = 0$.

Proof. By Definition 1, Lemma 1 holds for $t = 0$. For $t + 1 > 0$, we prove Lemma 1 by induction, assuming that it holds up to $t \ge 0$. By this assumption, the distribution of $v_t \in \mathbb{R}^2$ is the bivariate normal distribution with mean $\mu_t$ and covariance matrix $\Sigma_t$. Then, the random variable $Av_t$ is distributed normally with mean $A\mu_t$ and covariance matrix $A\Sigma_t A^\top$, and the random variable $\epsilon_t$ is distributed by $N(\epsilon_t \mid \mu, \Sigma)$. Thus, by the VAR Equation (1), the random variable $v_{t+1} := Av_t + \epsilon_t$ has a distribution calculated by the convolution integral
$$P(v_{t+1}) = \int N(v_{t+1} - \epsilon_t \mid A\mu_t, A\Sigma_t A^\top)\, N(\epsilon_t \mid \mu, \Sigma)\, d\epsilon_t .$$
Calculating this, we have $N(v_{t+1} \mid \mu_{t+1}, \Sigma_{t+1})$ with $\mu_{t+1} := A\mu_t + \mu$ and $\Sigma_{t+1} := A\Sigma_t A^\top + \Sigma$. Thus, Lemma 1 holds for $t + 1$ and, by induction, for any integer $t \ge 0$.
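The recursion of Lemma 1 can be checked numerically. The sketch below, with illustrative parameter values, iterates $\mu_{t+1} = A\mu_t + \mu$ and $\Sigma_{t+1} = A\Sigma_t A^\top + \Sigma$ from a deterministic $v_0$; for a stable $A$, the moments converge to their stationary values.

```python
import numpy as np

# Sketch of Lemma 1: propagate the marginal mean and covariance of v_t.
# From a deterministic v_0 we have mu_0 = v_0 and Sigma_0 = 0, then
# mu_{t+1} = A mu_t + mu, Sigma_{t+1} = A Sigma_t A^T + Sigma.
# Parameter values are illustrative.
A = np.array([[0.5, 0.2], [0.1, 0.4]])
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])
mu = np.zeros(2)
v0 = np.array([1.0, -1.0])

mu_t, Sigma_t = v0.copy(), np.zeros((2, 2))
for _ in range(50):
    mu_t = A @ mu_t + mu
    Sigma_t = A @ Sigma_t @ A.T + Sigma

# For a stable A, (mu_t, Sigma_t) approaches the stationary mean and covariance.
print(mu_t)
print(Sigma_t)
```

After 50 iterations the covariance is, up to numerical precision, a fixed point of the recursion, i.e., a solution of the Lyapunov equation introduced in the next section.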

Stability of VAR: Lyapunov Equation
By Lemma 1, the mean and covariance matrix of the random variable at the $t$-th step are
$$\mu_t = A^t v_0 + \sum_{k=0}^{t-1} A^k \mu, \qquad \Sigma_t = \sum_{k=0}^{t-1} A^k \Sigma (A^k)^\top .$$
From this, we have a stationary distribution if and only if the absolute values of both eigenvalues $\lambda_0, \lambda_1 \in \mathbb{C}$ of the coefficient matrix $A$ are less than 1. If there is such a stationary distribution, we call the VAR stable, and its stationary distribution is the bivariate normal distribution $N(v \mid \bar v, \bar\Sigma)$, where the stationary mean vector $\bar v \in \mathbb{R}^2$ and the stationary covariance matrix $\bar\Sigma \in \mathbb{R}^{2\times2}$ are defined as the limits of $\mu_t$ and $\Sigma_t$, respectively. If the VAR is stable, we have the following Lyapunov equation for the stationary covariance matrix $\bar\Sigma \in \mathbb{R}^{2\times2}$:
$$\bar\Sigma = A \bar\Sigma A^\top + \Sigma. \tag{4}$$
The Lyapunov equation is solved analytically by
$$\operatorname{vec}(\bar\Sigma) = (I_4 - A \otimes A)^{-1} \operatorname{vec}(\Sigma), \tag{5}$$
where $I_d \in \mathbb{R}^{d\times d}$ is the $d$-th order identity matrix, $\otimes$ denotes the Kronecker product, and $\operatorname{vec}(X)$ for any matrix $X$ stacks the columns of $X$ into a single vector. The Lyapunov Equation (4) has a solution for $\bar\Sigma$ if the VAR is stable, but not vice versa. This is shown below via Lemma 4.
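The vectorized solution (5) is straightforward to verify numerically; the sketch below uses column-major $\operatorname{vec}$ together with the Kronecker identity $\operatorname{vec}(AXA^\top) = (A \otimes A)\operatorname{vec}(X)$, with illustrative parameter values.

```python
import numpy as np

# Sketch: solve the Lyapunov equation Sigma_bar = A Sigma_bar A^T + Sigma
# via vec(Sigma_bar) = (I_4 - A kron A)^{-1} vec(Sigma), Equation (5).
A = np.array([[0.5, 0.2], [0.1, 0.4]])          # illustrative stable A
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])      # illustrative base covariance

vec = lambda X: X.flatten(order="F")             # column-stacking vec operator
Sigma_bar = np.linalg.solve(np.eye(4) - np.kron(A, A), vec(Sigma))
Sigma_bar = Sigma_bar.reshape(2, 2, order="F")   # un-vec back into a 2x2 matrix

print(Sigma_bar)
```

The result is symmetric and satisfies the fixed-point Equation (4) up to floating-point error; `scipy.linalg.solve_discrete_lyapunov` offers the same computation for larger systems.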

Transfer Entropy and Granger Causality
In [9], the transfer entropy of an appropriate triplet of variables in the VAR model is shown to be equivalent to Granger causality up to the constant factor 2. Following this guide, we define this quantity as the Granger causality of the VAR model, below.
Although this relationship has been known in a more general form [9], we re-derive it for bivariate VAR in order to later analyze the structure of VAR and GCs in depth: for example, their upper and lower bounds (Lemma 3), stability (Section 4), and design principle (Section 5).

Definition 2.
If the VAR with its random variables $v_t = (x_t, y_t)^\top \in \mathbb{R}^2$ for $t = 0, 1, \ldots$ is stable, the transfer entropy from $y$ to $x$ is defined by
$$T_{y \to x} := \lim_{t \to \infty} \left[ H(x_{t+1} \mid x_t) - H(x_{t+1} \mid x_t, y_t) \right],$$
and the transfer entropy from $x$ to $y$ is defined by
$$T_{x \to y} := \lim_{t \to \infty} \left[ H(y_{t+1} \mid y_t) - H(y_{t+1} \mid x_t, y_t) \right],$$
where the differential entropy of a random variable $x$ with probability density function $P$ is
$$H(x) := -\int P(x) \log P(x)\, dx,$$
and the conditional entropy is $H(x \mid y) := H(x, y) - H(y)$. In particular, we call two times the transfer entropy the Granger causality, denoted by
$$G_0 := 2\, T_{y \to x}, \qquad G_1 := 2\, T_{x \to y}.$$
Specifically, the GCs are written in terms of the VAR parameters in the following lemma.
Lemma 2 (Granger causality). Suppose a stable VAR has its base covariance matrix $\Sigma = (\sigma_{i,j})_{i,j=0,1}$, coefficient matrix $A = (a_{i,j})_{i,j=0,1}$, and stationary covariance matrix $\bar\Sigma = (\bar\sigma_{i,j})_{i,j=0,1}$. Then each Granger causality of this VAR, for $i = 0, 1$, is
$$G_i = \log\left(1 + \frac{a_{i,1-i}^2\,\det\bar\Sigma}{\sigma_{i,i}\,\bar\sigma_{i,i}}\right). \tag{13, 14}$$

Proof. In general, the differential entropy of the multivariate normal distribution $N(v \mid \mu, \Sigma)$ with $v \in \mathbb{R}^d$ is
$$H(v) = \frac{1}{2}\log\left|2\pi e\, \Sigma\right|,$$
and the two marginal probability distributions of $x_t$ and $y_t$ are $N(x_t \mid (\mu_t)_0, (\Sigma_t)_{0,0})$ and $N(y_t \mid (\mu_t)_1, (\Sigma_t)_{1,1})$. Thus, the conditional entropies of $x_{t+1}$ and $y_{t+1}$ given $v_t = (x_t, y_t)^\top$ are
$$H(x_{t+1} \mid x_t, y_t) = \frac{1}{2}\log|2\pi e\, \sigma_{0,0}| \quad\text{and}\quad H(y_{t+1} \mid x_t, y_t) = \frac{1}{2}\log|2\pi e\, \sigma_{1,1}|.$$
With the conditional probability distribution $P(v_{t+1} \mid v_t) = N(v_{t+1} \mid A v_t + \mu, \Sigma)$ and the marginal probability distribution $P(v_t) = N(v_t \mid \mu_t, \Sigma_t)$, the joint probability distribution of $v_t$ and $v_{t+1}$ is their product $P(v_{t+1}, v_t) = P(v_{t+1} \mid v_t)\, P(v_t)$, a quad-variate normal distribution. From this joint probability distribution $P(v_{t+1}, v_t)$, we derive the marginal distributions of the pairs $(x_{t+1}, x_t)$ and $(y_{t+1}, y_t)$, where for $i = 0, 1$ the mean vectors and covariance matrices are
$$\mu_{t,i} = \begin{pmatrix} e_i^\top \mu_{t+1} \\ e_i^\top \mu_t \end{pmatrix}, \qquad \Sigma_{t,i} = \begin{pmatrix} e_i^\top \Sigma_{t+1} e_i & e_i^\top A \Sigma_t e_i \\ e_i^\top \Sigma_t A^\top e_i & e_i^\top \Sigma_t e_i \end{pmatrix},$$
with the unit vectors $e_0 := (1, 0)^\top$ and $e_1 := (0, 1)^\top$. Thus, we have the joint entropy of $x_t$ and $x_{t+1}$ and the marginal entropy of $x_t$, and from these, the conditional entropy
$$H(x_{t+1} \mid x_t) = H(x_{t+1}, x_t) - H(x_t).$$
By the stability of the VAR, the Lyapunov Equation (4) holds, and this conditional entropy in the limit $t \to \infty$ is
$$\lim_{t\to\infty} H(x_{t+1} \mid x_t) = \frac{1}{2}\log\left|2\pi e\left(\bar\sigma_{0,0} - \frac{(e_0^\top A\bar\Sigma e_0)^2}{\bar\sigma_{0,0}}\right)\right|.$$
Applying Definition 2 and denoting the entries of the stationary covariance matrix by $\bar\Sigma = \begin{pmatrix} \bar\sigma_{0,0} & \bar\sigma_{0,1} \\ \bar\sigma_{1,0} & \bar\sigma_{1,1} \end{pmatrix}$, we have
$$G_0 = \log\frac{\bar\sigma_{0,0} - (e_0^\top A\bar\Sigma e_0)^2/\bar\sigma_{0,0}}{\sigma_{0,0}}.$$
Similarly, we have
$$G_1 = \log\frac{\bar\sigma_{1,1} - (e_1^\top A\bar\Sigma e_1)^2/\bar\sigma_{1,1}}{\sigma_{1,1}}.$$
Let us define for $i = 0, 1$
$$\delta_i := \bar\sigma_{i,i} - \frac{(e_i^\top A\bar\Sigma e_i)^2}{\bar\sigma_{i,i}} - \sigma_{i,i}.$$
By the Lyapunov equation, $\sigma_{i,i} = \bar\sigma_{i,i} - e_i^\top A\bar\Sigma A^\top e_i$. Applying this to $\delta_i$, we have
$$\delta_i = e_i^\top A\bar\Sigma A^\top e_i - \frac{(e_i^\top A\bar\Sigma e_i)^2}{\bar\sigma_{i,i}} = \frac{a_{i,1-i}^2\,\det\bar\Sigma}{\bar\sigma_{i,i}}.$$
As $G_i = \log\left((\sigma_{i,i} + \delta_i)/\sigma_{i,i}\right)$, this gives the closed form of the lemma.

The Granger causality has its lower and upper bounds in theory. Although these bounds may be further narrowed by considering the stability of the VAR, what follows below are the theoretical bounds, regardless of the stability of the VAR.
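The closed form of Lemma 2 can be cross-checked numerically against the defining conditional-variance ratio of Granger causality. The sketch below uses illustrative parameter values; the formula is the closed form as reconstructed above.

```python
import numpy as np

# Sketch of Lemma 2: Granger causality from the VAR parameters,
#   G_i = log(1 + a_{i,1-i}^2 det(Sigma_bar) / (sigma_ii * sigma_bar_ii)),
# with Sigma_bar obtained from the Lyapunov equation. Values are illustrative.
A = np.array([[0.5, 0.2], [0.1, 0.4]])
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])

Sigma_bar = np.linalg.solve(np.eye(4) - np.kron(A, A),
                            Sigma.flatten(order="F")).reshape(2, 2, order="F")

def granger(i):
    j = 1 - i
    return np.log(1.0 + A[i, j] ** 2 * np.linalg.det(Sigma_bar)
                  / (Sigma[i, i] * Sigma_bar[i, i]))

G0, G1 = granger(0), granger(1)   # G0: y -> x, G1: x -> y
print(G0, G1)
```

An equivalent computation is $G_0 = \log(\operatorname{Var}(x_{t+1} \mid x_t)/\operatorname{Var}(x_{t+1} \mid x_t, y_t))$ with the conditional variances read off the stationary joint Gaussian; both routes agree.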

Lemma 3 (The upper and lower bounds for Granger causality).
For each $i = 0, 1$, the Granger causality $G_i$ has the following bounds:
$$0 \le G_i \le \log\gamma_i,$$
where $\gamma_i := \bar\sigma_{i,i}/\sigma_{i,i} \ge 1$ due to the Lyapunov Equation (4). The lower bound $G_i = 0$ is attained only if $a_{i,1-i}^2 \det\bar\Sigma = 0$. The upper bound $G_i = \log\gamma_i$ is attained only if $a_{i,1-i}^2 \det\bar\Sigma = (\bar\sigma_{i,i} - \sigma_{i,i})\,\bar\sigma_{i,i}$.

Proof. As the stationary covariance matrix is (semi-)positive definite, $\det\bar\Sigma \ge 0$. Thus, the lower bound of Granger causality is $G_i \ge \log 1 = 0$, and this bound is reachable only when $a_{i,1-i}^2 \det\bar\Sigma = 0$. Modifying (13) and (14), for $i = 0, 1$ we have
$$G_i = \log\left(\sigma_{i,i} + \frac{a_{i,1-i}^2 \det\bar\Sigma}{\bar\sigma_{i,i}}\right) - \log\sigma_{i,i}.$$
By the Lyapunov Equation (4), and since $a_i^\top\bar\Sigma a_i$ minimized over $a_{i,i}$ with $a_{i,1-i}$ fixed equals $a_{i,1-i}^2\det\bar\Sigma/\bar\sigma_{i,i}$, we have
$$\frac{a_{i,1-i}^2 \det\bar\Sigma}{\bar\sigma_{i,i}} \le e_i^\top A\bar\Sigma A^\top e_i = \bar\sigma_{i,i} - \sigma_{i,i},$$
which gives the upper bound $G_i \le \log\gamma_i$. The upper bound of Lemma 3 can also be obtained from the information-theoretic chain $H(x_{t+1} \mid x_t, y_t) \le H(x_{t+1} \mid x_t) \le H(x_{t+1})$, whose outer difference yields $\frac{1}{2}\log\gamma_0$ in the limit $t \to \infty$ (and similarly for $G_1$).
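The bounds of Lemma 3 can be verified on the same kind of illustrative stable VAR:

```python
import numpy as np

# Sketch of Lemma 3: check 0 <= G_i <= log(gamma_i) with
# gamma_i = sigma_bar_ii / sigma_ii, on an illustrative stable VAR.
A = np.array([[0.5, 0.2], [0.1, 0.4]])
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])
Sigma_bar = np.linalg.solve(np.eye(4) - np.kron(A, A),
                            Sigma.flatten(order="F")).reshape(2, 2, order="F")

bounds_ok = True
for i in (0, 1):
    j = 1 - i
    G_i = np.log(1.0 + A[i, j] ** 2 * np.linalg.det(Sigma_bar)
                 / (Sigma[i, i] * Sigma_bar[i, i]))
    gamma_i = Sigma_bar[i, i] / Sigma[i, i]
    bounds_ok = bounds_ok and (0.0 <= G_i <= np.log(gamma_i))
print(bounds_ok)
```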

Stability and Constraints of VAR
In this study, we primarily consider the class of stable VAR models with a proper set of parameters. In this class, the statistical nature of any VAR is characterized by the base covariance matrix $\Sigma \in \mathbb{R}^{2\times2}$, the coefficient matrix $A \in \mathbb{R}^{2\times2}$, and the stationary covariance matrix $\bar\Sigma \in \mathbb{R}^{2\times2}$. Let us denote the set of (strictly) positive-definite matrices by
$$\mathbb{R}^{2\times2}_{+} := \left\{ \Sigma \in \mathbb{R}^{2\times2} \mid \Sigma = \Sigma^\top,\ x^\top \Sigma x > 0 \text{ for all } x \ne 0 \right\},$$
and the stable set by
$$\mathbb{R}^{2\times2}_{*} := \left\{ A \in \mathbb{R}^{2\times2} \mid |\mathrm{tr}(A)| - 1 < \det(A) < 1 \right\}.$$
We will briefly show that the stable set $\mathbb{R}^{2\times2}_{*}$ includes all and only coefficient matrices of stable bivariate VAR models.
With this notation of the set of matrices, the two conditions that any proper VAR model needs to satisfy are as follows.

Properness
To have a proper (non-degenerate) bivariate normal distribution in a VAR model, its base covariance matrix $\Sigma$ and stationary covariance matrix $\bar\Sigma$ need to satisfy $\Sigma, \bar\Sigma \in \mathbb{R}^{2\times2}_{+}$. The set of positive-definite matrices is equivalently written with the entries of the matrix: $\Sigma = \begin{pmatrix} \sigma_{0,0} & \sigma_{0,1} \\ \sigma_{1,0} & \sigma_{1,1} \end{pmatrix} \in \mathbb{R}^{2\times2}_{+}$ if and only if $\sigma_{0,1} = \sigma_{1,0}$, $\sigma_{0,0} > 0$, and $\sigma_{0,0}\sigma_{1,1} - \sigma_{0,1}^2 > 0$.

Stability of VAR
As stated previously in Section 2.2, the stability of VAR is primarily characterized by the eigenvalues of the coefficient matrix A. However, this condition is equivalent to A ∈ R 2×2 * , as shown by the following lemma.

Lemma 4. A given bivariate VAR model with its coefficient matrix $A \in \mathbb{R}^{2\times2}$ is stable if and only if $A \in \mathbb{R}^{2\times2}_{*}$, that is,
$$|\mathrm{tr}(A)| - 1 < \det(A) < 1. \tag{40}$$

Proof. Let $\lambda$ be an eigenvalue of the coefficient matrix $A$. Such an eigenvalue satisfies the characteristic equation
$$\lambda^2 - \mathrm{tr}(A)\,\lambda + \det(A) = 0. \tag{24}$$
If a VAR is stable, this eigenvalue needs to satisfy $|\lambda| < 1$. As (24) is solved by
$$\lambda = \frac{\mathrm{tr}(A) \pm \sqrt{\mathrm{tr}(A)^2 - 4\det(A)}}{2},$$
we analyze the condition $|\lambda| < 1$ in the following two cases, with $\lambda$ real or non-real.

1. If $\lambda$ is real, i.e., $\mathrm{tr}(A)^2 - 4\det(A) \ge 0$, both roots of (24) lie in $(-1, 1)$ if and only if the characteristic polynomial is positive at $\pm 1$, that is, $1 - \mathrm{tr}(A) + \det(A) > 0$ and $1 + \mathrm{tr}(A) + \det(A) > 0$; equivalently, $|\mathrm{tr}(A)| - 1 < \det(A)$. As the product of two real roots in $(-1, 1)$ satisfies $|\lambda_0 \lambda_1| < 1$, the condition $\det(A) < 1$ holds automatically in this case.

2. If $\lambda$ is not real, i.e., $\mathrm{tr}(A)^2 - 4\det(A) < 0$, then $\lambda$ and its conjugate are
$$\lambda = \frac{\mathrm{tr}(A) \pm j\sqrt{4\det(A) - \mathrm{tr}(A)^2}}{2},$$
with the imaginary unit denoted by $j$, and $|\lambda|^2 = \det(A)$. The stability condition is thus $\det(A) < 1$. Further, as
$$\frac{\mathrm{tr}(A)^2}{4} - |\mathrm{tr}(A)| + 1 = \left(\frac{|\mathrm{tr}(A)|}{2} - 1\right)^2 \ge 0,$$
with equality only for $|\mathrm{tr}(A)| = 2$, and as $\det(A) > \mathrm{tr}(A)^2/4$ in this case, the condition $|\mathrm{tr}(A)| - 1 < \det(A)$ holds automatically.

Integrating the two cases, the VAR with the coefficient matrix $A$ is stable if and only if
$$|\mathrm{tr}(A)| - 1 < \det(A) < 1,$$
which is the single inequality (40) for $A$.
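The equivalence in Lemma 4 is easy to stress-test numerically by comparing the eigenvalue condition against the scalar trace-determinant condition on random matrices:

```python
import numpy as np

# Sketch of Lemma 4: for random 2x2 matrices, the eigenvalue condition
# max|lambda| < 1 agrees with the scalar condition |tr(A)| - 1 < det(A) < 1.
rng = np.random.default_rng(1)

def stable_by_eigs(M):
    return bool(np.all(np.abs(np.linalg.eigvals(M)) < 1))

def stable_by_trace_det(M):
    t, d = np.trace(M), np.linalg.det(M)
    return bool(abs(t) - 1 < d < 1)

agree = all(
    stable_by_eigs(M) == stable_by_trace_det(M)
    for M in (rng.uniform(-1.5, 1.5, size=(2, 2)) for _ in range(1000))
)
print(agree)
```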

Stability and Existence of the Solution for the Lyapunov Equation
Intuitively, it would be reasonable if there were a stationary covariance matrix $\bar\Sigma \in \mathbb{R}^{2\times2}_{+}$ satisfying the Lyapunov Equation (4) whenever the coefficient matrix is $A \in \mathbb{R}^{2\times2}_{*}$. However, this is not trivial, as the opposite may not always be true: the existence of $\bar\Sigma \in \mathbb{R}^{2\times2}_{+}$ does not imply $A \in \mathbb{R}^{2\times2}_{*}$. This relationship between $A$ and $\bar\Sigma$ is stated by the following Theorem 1.
Theorem 1. If a bivariate VAR model is stable, i.e., $A \in \mathbb{R}^{2\times2}_{*}$, then $\det(I_4 - A \otimes A) > 0$, and the Lyapunov Equation (5) has a unique solution for $\bar\Sigma$.

Proof. Find the identity
$$\det(I_4 - A \otimes A) = \left((1 + \det(A))^2 - \mathrm{tr}(A)^2\right)(1 - \det(A))^2. \tag{41}$$
By Lemma 4 and (41), we have $\det(I_4 - A \otimes A) > 0$. Thus, Lyapunov Equation (5) has the solution for $\bar\Sigma$, as the matrix $(I_4 - A \otimes A)$ is invertible. The converse of this theorem does not hold, as we can construct a counter-example coefficient matrix $A$ such that $\det(I_4 - A \otimes A) < 0$, with which there is a $\bar\Sigma \in \mathbb{R}^{2\times2}_{+}$, but such a VAR is not stable.
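The determinant identity (41) can be checked numerically on random matrices:

```python
import numpy as np

# Sketch of identity (41):
#   det(I_4 - A kron A) = ((1 + det A)^2 - tr(A)^2) * (1 - det A)^2,
# checked numerically on random 2x2 matrices.
rng = np.random.default_rng(2)

ok = True
for _ in range(100):
    A = rng.normal(size=(2, 2))
    lhs = np.linalg.det(np.eye(4) - np.kron(A, A))
    t, d = np.trace(A), np.linalg.det(A)
    rhs = ((1 + d) ** 2 - t ** 2) * (1 - d) ** 2
    ok = ok and bool(np.isclose(lhs, rhs))
print(ok)
```

The identity follows from the fact that the eigenvalues of $A \otimes A$ are the pairwise products $\lambda_i \lambda_j$ of the eigenvalues of $A$.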

Design of Bivariate Timeseries Given GCs
The goal of this study was to derive a design principle of bivariate timeseries generated by a VAR model with a desired correlation and two types of Granger causality. In this section, we explore the inter-dependent relationships among the variables in the VAR. This analysis reveals a trade-off limitation in designing these variables of a timeseries.
Specifically, a timeseries with a certain range of desired Granger causality cannot be realized by any stable VAR, in which case no stationary covariance is defined in theory.
The set of variables in any stable VAR model includes:
• The coefficient matrix $A$;
• The base covariance matrix $\Sigma$;
• The stationary covariance matrix $\bar\Sigma$; and
• The two types of Granger causality $G_0, G_1$.

There are equality constraints on these variables:
• The variables $A, \Sigma, \bar\Sigma$ need to satisfy the Lyapunov Equation (4).
• The Granger causality $G_i$ ($i = 0, 1$) is a function of $a_{i,1-i}$, $\sigma_{i,i}$, and $\bar\Sigma$ (Lemma 2).

Besides, it is important to know the feasibility of a set of parameters in a VAR, which constrains the range of these variables. Let us denote the row vectors of the coefficient matrix by $a_0 := (a_{0,0}, a_{0,1})^\top$ and $a_1 := (a_{1,0}, a_{1,1})^\top$. The Lyapunov equation is then equivalently written with these vectors $a_0, a_1$ by the set of the three equations
$$\bar\sigma_{0,0} - \sigma_{0,0} = a_0^\top \bar\Sigma a_0, \tag{42}$$
$$\bar\sigma_{1,1} - \sigma_{1,1} = a_1^\top \bar\Sigma a_1, \tag{43}$$
$$\bar\sigma_{0,1} - \sigma_{0,1} = a_0^\top \bar\Sigma a_1. \tag{44}$$
Equations (42) and (43) above imply that each of the vectors $a_0$ and $a_1$ lies on an ellipse in its plane. This gives the lower bound $\bar\sigma_{i,i} \ge \sigma_{i,i}$ (i.e., one condition of the properness above), as $x^\top \bar\Sigma x \ge 0$ for any $x \in \mathbb{R}^2$ with a positive-definite matrix $\bar\Sigma$. Fixing $G_0$ and $G_1$ imposes each of the two vectors $a_0$ and $a_1$ on two parallel lines given by
$$a_{i,1-i} = \pm\sqrt{\frac{(e^{G_i} - 1)\,\sigma_{i,i}\,\bar\sigma_{i,i}}{\det\bar\Sigma}}. \tag{45}$$
Thus, the solution of $a_i$ which satisfies the Lyapunov equation and the fixed Granger causality lies on the (at most) four intersections of the ellipse and the two parallel lines on the plane $(a_{i,0}, a_{i,1}) \in \mathbb{R}^2$ (Figure 2, where the intersections are depicted by the colored points). This ellipse is obtained by a scaling and shearing transformation of the standard circle $a_{0,0}^2 + a_{0,1}^2 = 1$. This observation gives the angular parametrization of the solution vector $(a_{0,0}, a_{0,1})^\top$, which is explicitly stated by Lemma 5 in the next section. As shown in Figure 2, Granger causality takes its maximum at the largest $|a_{0,1}|$ on the ellipse and its minimum at $|a_{0,1}| = 0$.
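The intersection of the ellipse and the parallel lines can be computed directly. The sketch below solves for $a_0$ given illustrative target values of $\bar\Sigma$, $\sigma_{0,0}$, and $G_0$: it sets $|a_{0,1}|$ from the fixed $G_0$ (the parallel-line condition implied by Lemma 2), then intersects with the ellipse constraint $\bar\sigma_{0,0} - \sigma_{0,0} = a_0^\top\bar\Sigma a_0$ of Equation (42). Feasibility is not guaranteed for arbitrary targets.

```python
import numpy as np

# Sketch of the design step for a_0: fix Sigma_bar, sigma_00, and a target G0.
# All numeric targets are illustrative choices.
Sigma_bar = np.array([[2.0, 0.3], [0.3, 2.0]])   # target stationary covariance
sigma00, G0 = 1.0, 0.1                            # target base variance and GC

det_bar = np.linalg.det(Sigma_bar)
# Parallel-line condition: |a01| fixed by the target G0.
a01 = np.sqrt((np.exp(G0) - 1.0) * sigma00 * Sigma_bar[0, 0] / det_bar)

# Ellipse constraint (42) as a quadratic in a00 (taking the branch a01 > 0):
#   sb00*a00^2 + 2*sb01*a01*a00 + sb11*a01^2 - (sb00 - sigma00) = 0
sb00, sb01, sb11 = Sigma_bar[0, 0], Sigma_bar[0, 1], Sigma_bar[1, 1]
roots = np.roots([sb00, 2.0 * sb01 * a01, sb11 * a01 ** 2 - (sb00 - sigma00)])

print(a01, roots)
```

The two real roots (together with the mirrored branch $-a_{0,1}$) correspond to the intersections depicted in Figure 2; whether any of them yields a stable and proper VAR must still be checked, as discussed in Section 5.2.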

Solution A of the Lyapunov Equation Given Σ̄, G_0, and G_1
In what follows, we start with the derivation of the coefficient matrix A as a root of the equality constraint by the Lyapunov Equation (4) and the Granger causality, for a fixed properΣ, σ i,i and G i for each i = 0, 1. The following Lemma 5 gives a necessary condition for the coefficient matrix A ∈ R 2×2 to satisfy the equality conditions above. Note, however, that such a solution A in this equation does not guarantee the stability of the corresponding VAR (i.e., A ∈ R 2×2 * ). This sufficiency is explored in Section 5.2.

Lemma 5. Any coefficient matrix $A$ that is a root of Equation (46) has its row vectors in the form
$$a_i = \sqrt{\bar\sigma_{i,i} - \sigma_{i,i}}\;\bar\Sigma^{-1/2} \begin{pmatrix} \cos\theta_i \\ \sin\theta_i \end{pmatrix}, \quad i = 0, 1, \tag{47}$$
where each pair of angles $\theta_0 \in [0, 2\pi)$ and $\theta_1 \in [0, 2\pi)$ takes one of the two or four pairs for which $|a_{i,1-i}|$ equals the value fixed by $G_i$ in (45) for each $i = 0, 1$, and $P_2 := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ denotes the coordinate swap exchanging the roles of the two components between $i = 0$ and $i = 1$.

Proof. Find that the pair of equations in (46) is symmetric under the exchange of $i = 0, 1$ (via $P_2$). Thus, we solve it for $i = 0$ below, and the result holds for $i = 1$ by symmetry.

Sufficiency of the Solution
For a given set of parameters, Lemma 5 in the previous section gives a set of solutions of the coefficient matrix $A$ for the Lyapunov equation. Note that not all of these solutions $A$ are feasible, in the sense of satisfying all the constraints, such as the stability $A \in \mathbb{R}^{2\times2}_{*}$ and the properness $\Sigma \in \mathbb{R}^{2\times2}_{+}$, where $\Sigma$ is derived from the Lyapunov Equation (4) given $A$ and $\bar\Sigma$. The following lemmas provide the sufficient condition for a solution $A$ by checking the properness of $\Sigma$ and the stability of $A$.

Lemma 6. Suppose $A$ is a solution of Equation (46) in Lemma 5, represented by a pair of angles $(\theta_0, \theta_1)$. In this case, $\Sigma \in \mathbb{R}^{2\times2}_{+}$ if and only if
$$\left(\bar\sigma_{0,1} - \sqrt{(\bar\sigma_{0,0} - \sigma_{0,0})(\bar\sigma_{1,1} - \sigma_{1,1})}\,\cos(\theta_0 - \theta_1)\right)^2 \le \sigma_{0,0}\,\sigma_{1,1},$$
where $\bar\eta \in [0, 2\pi]$ is the angular parametrization of the correlation coefficient defined by $\cos\bar\eta := \bar\sigma_{0,1}/\sqrt{\bar\sigma_{0,0}\,\bar\sigma_{1,1}}$.

Proof. Applying the polar representation of $a_0, a_1$ in (47) of Lemma 5 to the third Equation (44) of the Lyapunov equation, we have
$$\sigma_{0,1} = \bar\sigma_{0,1} - \sqrt{(\bar\sigma_{0,0} - \sigma_{0,0})(\bar\sigma_{1,1} - \sigma_{1,1})}\,\cos(\theta_0 - \theta_1). \tag{57}$$
By the positive definiteness of $\Sigma$, $\sigma_{0,1}^2 \le \sigma_{0,0}\sigma_{1,1}$. This inequality applied to (57) gives the lemma.
Like Lemma 6, Lemma 7 yields the trade-off relationship between $\bar\Sigma$, through the angle $\bar\eta$, and $A$, through the angles $\theta_0, \theta_1$. In general, this bound further narrows the upper and lower bounds given by Lemma 3: the correlation is limited to values close to zero if one wishes for a higher Granger causality, while, on the other hand, the two types of Granger causality are limited to values close to zero if one wishes for a correlation of higher absolute value.

Summary and Potential Usage of the Algorithm
In this paper, we explored the relationship between the VAR parameters and the timeseries statistics (Figure 1), and identified the trade-off limitation between the stationary covariance $\bar\Sigma$ and the Granger causalities $G_0, G_1$ (Lemma 6). This suggests that the following Algorithm 1 will generate a timeseries with the desired statistics. This data-generation algorithm can be used to generate surrogate data [19], which can be used to test whether an empirical timeseries is a sample from a VAR with a given correlation and Granger causality. This algorithm is also useful in analyzing to what extent a class of VAR timeseries varies under the same statistics.

Validity of Granger Causality Estimated on Empirical Timeseries
Our analysis also warns that not all Granger causality (or transfer entropy) is "valid", in the sense that its underlying VAR model may not be stable, in which case the Granger causality is undefined in theory. In practice, we can compute some value of Granger causality for a finite empirical timeseries even when it is generated by an underlying unstable VAR model without any stationary statistics. Such timeseries statistics will diverge in the long run, but this may be difficult to identify with a finite empirical timeseries. This asymmetry, namely that the Granger causality can be calculated numerically but does not guarantee the stability of the underlying VAR, is explicitly demonstrated by Theorem 1. For a given empirical timeseries $v = (v_0, v_1, \ldots, v_T) \in \mathbb{R}^{2\times(T+1)}$, one should calculate not just the Granger causality but also its validity, by checking (1) $A \in \mathbb{R}^{2\times2}_{*}$, (2) $\Sigma \in \mathbb{R}^{2\times2}_{+}$, and (3) $\Sigma_{0,1} \le \bar\Sigma_{0,1}$, by calculating the maximum likelihood estimator $(\hat A(v), \hat\Sigma(v))$, such as
$$\hat A(v) = V_{1,0} V_{0,0}^{-1}, \qquad \hat\Sigma(v) = V_{1,1} - V_{1,0} V_{0,0}^{-1} V_{0,1}, \qquad V_{j,k} := \frac{1}{T}\sum_{t=0}^{T-1} v_{t+j}\, v_{t+k}^\top .$$
In fact, $\hat\Sigma(v)$ is always (semi-)positive definite, as it takes the form of the Schur complement of the (semi-)positive-definite matrix $\begin{pmatrix} V_{0,0} & V_{0,1} \\ V_{1,0} & V_{1,1} \end{pmatrix}$. Thus, this maximum likelihood estimator readily satisfies condition (2).
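The estimation-and-validation procedure above can be sketched as follows. The estimator is the standard least-squares (Gaussian MLE) form for a VAR(1); the true parameters and the sample size are illustrative choices.

```python
import numpy as np

# Sketch: least-squares (Gaussian MLE) estimation of (A, Sigma) from a
# simulated timeseries, followed by the validity checks discussed above.
rng = np.random.default_rng(3)
A_true = np.array([[0.5, 0.2], [0.1, 0.4]])        # illustrative true parameters
Sigma_true = np.array([[1.0, 0.3], [0.3, 1.0]])

T = 20_000
v = np.zeros((T + 1, 2))
for t in range(T):
    v[t + 1] = A_true @ v[t] + rng.multivariate_normal(np.zeros(2), Sigma_true)

V00 = v[:-1].T @ v[:-1] / T          # (1/T) sum_t v_t v_t^T
V10 = v[1:].T @ v[:-1] / T           # (1/T) sum_t v_{t+1} v_t^T
V11 = v[1:].T @ v[1:] / T

A_hat = V10 @ np.linalg.inv(V00)
Sigma_hat = V11 - V10 @ np.linalg.inv(V00) @ V10.T   # Schur complement form

stable = bool(abs(np.trace(A_hat)) - 1 < np.linalg.det(A_hat) < 1)  # check (1)
proper = bool(np.all(np.linalg.eigvalsh(Sigma_hat) > 0))            # check (2)
print(stable, proper)
```

With a stable generating model and a long series, both checks pass and the estimates are close to the true parameters; on an unstable generating model, check (1) would flag the estimated Granger causality as invalid.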

Future Work
In this paper, we limited the VAR model to the bivariate case for simplicity of analysis. We expect that it is possible to generalize the current results to higher-dimensional VAR models. In such a generalization, the feasible boundaries for stable VAR models may require further effort to understand.