# The Geometric Meaning of the Notion of Joint Unpredictability of a Bivariate VAR(1) Stochastic Process

Department of Computer Engineering, Computer Science and Mathematics, University of L'Aquila, Via Vetoio, I-67010 Coppito, L'Aquila, I-67100, Italy
Econometrics 2013, 1(3), 207-216; https://doi.org/10.3390/econometrics1030207
Received: 21 August 2013 / Revised: 5 November 2013 / Accepted: 5 November 2013 / Published: 14 November 2013

## Abstract

This paper investigates, in a particular parametric framework, the geometric meaning of joint unpredictability for a bivariate discrete-time process. In particular, the paper provides a characterization of joint unpredictability in terms of the distance between information sets in a Hilbert space.

## 1. Introduction

Let $(\Omega, \mathcal{F}, P)$ be a probability space and $y_t = [y_{1,t}, y_{2,t}]'$, $t = 0, 1, \dots$, a bivariate stochastic process defined on $(\Omega, \mathcal{F}, P)$. We consider the differenced process $\Delta y_t = [\Delta y_{1,t}, \Delta y_{2,t}]'$, $t = 1, \dots$, where $\Delta$ is the first-difference operator. Following Caporale and Pittis [1] and Hassapis et al. [2], we say that the process $\{\Delta y_t;\ t = 1, \dots\}$ is jointly unpredictable if
$E(\Delta y_{t+1} \mid \sigma(y_t, \dots, y_0)) = 0 \quad \forall t$
where $\sigma(y_t, \dots, y_0)$ is the σ-field generated by the past vectors $y_i$, $i = 0, \dots, t$.
The goal of this paper is to show that, in a particular parametric framework, the notion of joint unpredictability can be characterized by a geometric condition. This characterization is given in terms of the distance between information sets in a Hilbert space. In particular, we will show that the process $\{\Delta y_t;\ t = 1, \dots\}$ is jointly unpredictable if and only if the information contained in its past is 'maximally distant' from the information contained in its future. Even if our result is not as general as might seem desirable, we think that the intuition gained from this characterization makes the notion of joint unpredictability clearer.
The rest of the paper is organized as follows. Section 2 presents the utilized mathematical framework. Section 3 presents the geometric characterization. Section 4 concludes.

## 2. Preliminaries

Definitions, notation, and preliminary results from Hilbert space theory are presented before establishing the main result. An excellent overview of the applications of Hilbert space methods to time series analysis can be found in Brockwell and Davis [3].
We use the following notation and symbols. Let $(\Omega, \mathcal{F}, P)$ be a probability space. We consider the Hilbert space $L^2(\Omega, \mathcal{F}, P)$ of all real square-integrable random variables on $(\Omega, \mathcal{F}, P)$. The inner product in $L^2(\Omega, \mathcal{F}, P)$ is defined by $\langle z, w \rangle = E(zw)$ for any $z, w \in L^2(\Omega, \mathcal{F}, P)$. The space $L^2(\Omega, \mathcal{F}, P)$ is a normed space, with norm given by $\|w\| = [E(w^2)]^{1/2}$. The distance between $z, w \in L^2(\Omega, \mathcal{F}, P)$ is $d(z, w) = \|z - w\|$. A sequence $\{z_n\} \subset L^2(\Omega, \mathcal{F}, P)$ is said to converge to a limit point $z \in L^2(\Omega, \mathcal{F}, P)$ if $d(z_n, z) \to 0$ as $n \to \infty$. A point $z \in L^2(\Omega, \mathcal{F}, P)$ is a limit point of a set $M \subset L^2(\Omega, \mathcal{F}, P)$ if it is the limit of a sequence from $M$. In particular, $M$ is said to be closed if it contains all its limit points. If $S$ is an arbitrary subset of $L^2(\Omega, \mathcal{F}, P)$, then the set of all linear combinations $\alpha_1 z_1 + \dots + \alpha_h z_h$ ($h = 1, 2, \dots$; $\alpha_1, \dots, \alpha_h$ arbitrary real numbers; $z_1, \dots, z_h$ arbitrary elements of $S$) is called the linear manifold spanned by $S$ and is denoted by $sp(S)$. If we add to $sp(S)$ all its limit points, we obtain a closed set that we call the closed linear manifold, or subspace, spanned by $S$, denoted by $\overline{sp}(S)$. Two elements $z, w \in L^2(\Omega, \mathcal{F}, P)$ are called orthogonal, written $z \perp w$, if $\langle z, w \rangle = 0$. If $S$ is any subset of $L^2(\Omega, \mathcal{F}, P)$, then we write $x \perp S$ if $x \perp s$ for all $s \in S$; similarly, the notation $S \perp T$, for two subsets $S$ and $T$ of $L^2(\Omega, \mathcal{F}, P)$, indicates that all elements of $S$ are orthogonal to all elements of $T$. For a given $z \in L^2(\Omega, \mathcal{F}, P)$ and a closed subspace $M$ of $L^2(\Omega, \mathcal{F}, P)$, we define the orthogonal projection of $z$ on $M$, denoted by $P(z|M)$, as the unique element of $M$ such that $\|z - P(z|M)\| \le \|z - w\|$ for any $w \in M$. We recall that if $z \perp M$, then $P(z|M) = 0$.
If $M$ and $N$ are two arbitrary subsets of $L^2(\Omega, \mathcal{F}, P)$, then the quantity
$d(M, N) = \inf\{\|m - n\|;\ m \in M,\ n \in N\}$
is called the distance between $M$ and $N$.
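As a concrete numerical counterpart of this definition, $L^2$ norms and inner products can be approximated by Monte Carlo, representing random variables as large samples and $E(zw)$ as a sample mean. The following sketch (the sets and variables are illustrative choices, not from the paper) computes $d(M, N)$ for two finite sets by brute force:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000  # Monte Carlo sample size

# Represent random variables by large samples: E(zw) is approximated by a
# sample mean, so ||z - w|| is approximated by sqrt(mean((z - w)**2)).
def dist(z, w):
    return np.sqrt(np.mean((z - w) ** 2))

# Two finite sets of random variables, built from independent
# standard normals e1 and e2 (illustrative, not from the paper)
e1 = rng.standard_normal(n)
e2 = rng.standard_normal(n)
M = [e1, 2.0 * e1]
N = [e2, e1 + e2]

# d(M, N) = inf ||m - n|| over m in M, n in N (brute force over finite sets)
d_MN = min(dist(m, v) for m in M for v in N)
print(d_MN)  # close to 1: the infimum is ||e1 - (e1 + e2)|| = ||e2|| = 1
```

Here the infimum is attained at $m = e_1$, $n = e_1 + e_2$, whose difference is a unit-variance variable, so $d(M, N) \approx 1$.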
We close this section by introducing some further definitions concerning discrete-time stochastic processes in $L^2(\Omega, \mathcal{F}, P)$.
Let $x_t$ be a univariate stochastic process. We say that $x_t$ is integrated of order one (denoted $x_t \sim I(1)$) if the process $\Delta x_t = x_t - x_{t-1}$ is stationary whereas $x_t$ is not. We say that the bivariate stochastic process $y_t = [y_{1,t}, y_{2,t}]'$ is integrated of order one if $y_{1,t} \sim I(1)$ and $y_{2,t} \sim I(1)$.
A stochastic process $y_t$ Granger causes another stochastic process $x_t$, with respect to a given information set $I_t$ that contains at least $x_{t-j}$, $y_{t-j}$, $j > 0$, if $x_t$ can be better predicted by using past values of $y$ than by not doing so, all other information in $I_t$ (including the past of $x$) being used in either case. More formally, we say that $y_t$ is Granger causal for $x_t$ with respect to $H_{xy}(t) = \overline{sp}\{x_t, y_t, x_{t-1}, y_{t-1}, \dots\}$ if
$\|x_{t+1} - P(x_{t+1} \mid H_{xy}(t))\|^2 < \|x_{t+1} - P(x_{t+1} \mid H_x(t))\|^2$
where $H_x(t) = \overline{sp}\{x_t, x_{t-1}, \dots\}$.
Two stochastic processes $x_t$ and $y_t$, both of which are individually $I(1)$, are said to be cointegrated if there exists a non-zero constant $\beta$ such that $z_t = x_t - \beta y_t$ is a stationary ($I(0)$) process.
It is important to note that cointegration between two variables implies the existence of causality (in the Granger sense) between them in at least one direction (see Granger [4]).
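A minimal simulation sketch of the cointegration definition (with illustrative parameter $\beta = 2$ and Gaussian shocks, not taken from the paper): $y_t$ is a random walk, $x_t = \beta y_t + e_t$, and the combination $z_t = x_t - \beta y_t$ is stationary even though both series are $I(1)$.

```python
import numpy as np

rng = np.random.default_rng(1)
T, beta = 10_000, 2.0  # illustrative values, not from the paper

# y_t: a random walk, hence I(1); x_t = beta*y_t + e_t with e_t stationary,
# so x_t is also I(1) and z_t = x_t - beta*y_t = e_t is I(0)
y = np.cumsum(rng.standard_normal(T))
e = rng.standard_normal(T)
x = beta * y + e

z = x - beta * y  # the cointegrating combination
print(np.var(y), np.var(z))  # var(y) grows with T; var(z) stays near 1
```

The sample variance of the $I(1)$ series grows with the sample size, while that of the cointegrating combination stays bounded.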

## 3. A Geometric Characterization

In this section we assume that $y_t = [y_{1,t}, y_{2,t}]'$, $t = 0, 1, \dots$, is a bivariate stochastic process defined on $(\Omega, \mathcal{F}, P)$, integrated of order one, with $y_{1,0} = y_{2,0} = 0$, that has a VAR(1) representation
$y_t = Ay_{t-1} + u_t$  (2)
where
$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$
is a fixed $(2 \times 2)$ coefficient matrix and $u_t = [u_{1,t}, u_{2,t}]'$ is an i.i.d. error term with $E(u_t) = 0$ and
$E(u_t u_t') = \Sigma = \begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix}$
for all $t$, and $E(u_t u_s') = 0$ for $s \neq t$.
In this framework, $y_{2,t}$ does not Granger cause $y_{1,t}$ if and only if $a_{12} = 0$. Similarly, $y_{1,t}$ does not Granger cause $y_{2,t}$ if and only if $a_{21} = 0$.
We observe that VAR residuals are usually correlated, so the covariance matrix $\Sigma$ is seldom diagonal. However, because the main aim of this study is pedagogical, we assume that $\Sigma$ is diagonal for analytical convenience.
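The model above can be simulated directly. The sketch below uses illustrative coefficients (the paper fixes no numerical values): $A$ has eigenvalues $1$ and $0.4$, so the simulated $y_t$ is $I(1)$ as assumed, and the shocks are drawn independently so that $\Sigma$ is diagonal.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 500

# Illustrative coefficient matrix (not from the paper): eigenvalues 1 and 0.4,
# so y_t is I(1), consistent with the paper's assumptions
A = np.array([[0.7, 0.3],
              [0.3, 0.7]])
sigma1, sigma2 = 1.0, 0.5  # standard deviations; Sigma is diagonal

y = np.zeros((T + 1, 2))  # y_0 = 0, as assumed
for t in range(1, T + 1):
    u = np.array([sigma1 * rng.standard_normal(),
                  sigma2 * rng.standard_normal()])
    y[t] = A @ y[t - 1] + u

dy = np.diff(y, axis=0)  # the differenced process Delta y_t, t = 1, ..., T
print(dy.shape)
```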
We consider the following information sets: $I_{\Delta y_1}(t^+) = \{\Delta y_{1,t+1}, \Delta y_{1,t+2}, \dots\}$, $I_{\Delta y_2}(t^+) = \{\Delta y_{2,t+1}, \Delta y_{2,t+2}, \dots\}$, $H_{\Delta y_1}(t) = \overline{sp}\{\Delta y_{1,t}, \Delta y_{1,t-1}, \dots\}$ and $H_{\Delta y_2}(t) = \overline{sp}\{\Delta y_{2,t}, \Delta y_{2,t-1}, \dots\}$.
Theorem 1. Let $y_t$ be a VAR(1) process defined as in (2). The differenced process $\{\Delta y_t;\ t = 1, \dots\}$ is jointly unpredictable if and only if
$d(I_{\Delta y_1}(t^+), H_{\Delta y_2}(t)) = \sigma_{\Delta y_1} \quad \text{and} \quad d(I_{\Delta y_2}(t^+), H_{\Delta y_1}(t)) = \sigma_{\Delta y_2}$  (4)
where $\sigma_{\Delta y_1}$ and $\sigma_{\Delta y_2}$ denote the norms $\|\Delta y_{1,t}\|$ and $\|\Delta y_{2,t}\|$, respectively.
Theorem 1 provides a geometric characterization of the notion of joint unpredictability of a bivariate process in terms of the distance between information sets. It is important to note that
$d(I_{\Delta y_1}(t^+), H_{\Delta y_2}(t)) \le \sigma_{\Delta y_1} \quad \text{and} \quad d(I_{\Delta y_2}(t^+), H_{\Delta y_1}(t)) \le \sigma_{\Delta y_2}$
since $0$ belongs to both subspaces $H_{\Delta y_2}(t)$ and $H_{\Delta y_1}(t)$. Thus the process $\Delta y_t = [\Delta y_{1,t}, \Delta y_{2,t}]'$, $t = 1, \dots$, is jointly unpredictable if and only if the distances $d(I_{\Delta y_1}(t^+), H_{\Delta y_2}(t))$ and $d(I_{\Delta y_2}(t^+), H_{\Delta y_1}(t))$ each achieve their maximum value.
It is intuitive that if these distances achieve their maximum value, then $\sigma(y_t, \dots, y_0)$ does not contain any valuable information about the future of the differenced series $\Delta y_t = [\Delta y_{1,t}, \Delta y_{2,t}]'$, which is therefore jointly unpredictable with respect to the information set $\sigma(y_t, \dots, y_0)$; that is, $E(\Delta y_{t+1} \mid \sigma(y_t, \dots, y_0)) = 0$.
We recall that Theorem 1 holds only in a bivariate setting.
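A Monte Carlo sketch of the theorem's content (all parameters illustrative, not from the paper): when $A = I$, sample analogues of $\langle \Delta y_{1,t+1}, \Delta y_{2,t} \rangle$ are close to zero, so the future of one differenced series is empirically orthogonal to the past of the other; with $A \neq I$ the cross-moment is bounded away from zero.

```python
import numpy as np

def sim_dy(A, T=100_000, seed=3):
    """Simulate y_t = A y_{t-1} + u_t (y_0 = 0, unit-variance independent
    shocks, so Sigma is diagonal) and return the differenced process."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal((T, 2))
    y = np.zeros((T + 1, 2))
    for t in range(T):
        y[t + 1] = A @ y[t] + u[t]
    return np.diff(y, axis=0)

def cross_cov(dy):
    # sample analogue of the inner product <dy1_{t+1}, dy2_t> = E(dy1_{t+1} dy2_t)
    return np.mean(dy[1:, 0] * dy[:-1, 1])

I2 = np.eye(2)
A1 = np.array([[0.7, 0.3], [0.3, 0.7]])  # A != I, rank(A - I) = 1 (illustrative)

print(cross_cov(sim_dy(I2)))  # near 0: Delta y_t is jointly unpredictable
print(cross_cov(sim_dy(A1)))  # bounded away from 0: predictable
```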

#### 3.1. Lemmas

In order to prove Theorem 1, we need the following lemmas.
Lemma 1.
Let $V$ be a closed subspace of $L^2(\Omega, \mathcal{F}, P)$ and $G \neq \emptyset$ a subset of $L^2(\Omega, \mathcal{F}, P)$ such that $\|g\| = \eta \in \mathbb{R}$ for all $g \in G$. Then $G \perp V$ if and only if $d(G, V) = \eta$.
Proof.
See Focker and Triacca ([5], p. 767).
Lemma 1 establishes a relationship between the orthogonality of sets/subspaces in the Hilbert space $L^2(\Omega, \mathcal{F}, P)$ and their distance: the orthogonality between $G$ and $V$ holds if and only if the distance $d(G, V)$ achieves its maximum value. In fact, $d(G, V)$ cannot be greater than $\eta$, since $0 \in V$.
Lemma 2.
The processes $y_{1,t}$ and $y_{2,t}$ are not cointegrated if and only if $A = I$.
Proof.
By (2) we have
$\begin{bmatrix} \Delta y_{1,t} \\ \Delta y_{2,t} \end{bmatrix} = \begin{bmatrix} a_{11} - 1 & a_{12} \\ a_{21} & a_{22} - 1 \end{bmatrix} \begin{bmatrix} y_{1,t-1} \\ y_{2,t-1} \end{bmatrix} + \begin{bmatrix} u_{1,t} \\ u_{2,t} \end{bmatrix}$
These equations must be balanced; that is, the orders of integration of $(a_{11} - 1)y_{1,t-1} + a_{12}y_{2,t-1}$ and of $a_{21}y_{1,t-1} + (a_{22} - 1)y_{2,t-1}$ must be zero.
(⇒) If $A \neq I$, since $(a_{11} - 1)y_{1,t-1} + a_{12}y_{2,t-1} \sim I(0)$ and $a_{21}y_{1,t-1} + (a_{22} - 1)y_{2,t-1} \sim I(0)$, we can have three cases.
Case (1): $A = [a_{ij}]$ with $a_{ij} \neq 0$ for $i, j = 1, 2$, $i \neq j$, and $a_{ii} \neq 1$ for $i = 1, 2$.
Case (2):
$A = \begin{bmatrix} a_{11} & a_{12} \\ 0 & 1 \end{bmatrix}$
with $a_{11} \neq 1$ and $a_{12} \neq 0$.
Case (3):
$A = \begin{bmatrix} 1 & 0 \\ a_{21} & a_{22} \end{bmatrix}$
with $a_{21} \neq 0$ and $a_{22} \neq 1$.
In all three cases, there exists at least one non-trivial linear combination of the processes $y_{1,t}$ and $y_{2,t}$ that is stationary. Thus we can conclude that $y_{1,t}$ and $y_{2,t}$ are cointegrated.
(⇐) If $A = I$, then $a_{12} = a_{21} = 0$, so $y_{1,t}$ does not Granger cause $y_{2,t}$ and $y_{2,t}$ does not Granger cause $y_{1,t}$. Since cointegration implies Granger causality in at least one direction (Granger [4]), it follows that $y_{1,t}$ and $y_{2,t}$ are not cointegrated.
Lemma 3.
If $y_{1,t}$ and $y_{2,t}$ are cointegrated, then $|a_{11} + a_{22} - 1| < 1$.
Proof.
We subtract $[y_{1,t-1}, y_{2,t-1}]'$ from both sides of Equation (2), obtaining
$\begin{bmatrix} \Delta y_{1,t} \\ \Delta y_{2,t} \end{bmatrix} = \begin{bmatrix} a_{11} - 1 & a_{12} \\ a_{21} & a_{22} - 1 \end{bmatrix} \begin{bmatrix} y_{1,t-1} \\ y_{2,t-1} \end{bmatrix} + \begin{bmatrix} u_{1,t} \\ u_{2,t} \end{bmatrix}$
If $y_{1,t}$ and $y_{2,t}$ are cointegrated, the matrix $A - I$ has rank 1 and can be factored as $\alpha\beta'$, so we have
$\begin{bmatrix} \Delta y_{1,t} \\ \Delta y_{2,t} \end{bmatrix} = \begin{bmatrix} \alpha_1 \\ \alpha_2 \end{bmatrix} \begin{bmatrix} \beta_1 & \beta_2 \end{bmatrix} \begin{bmatrix} y_{1,t-1} \\ y_{2,t-1} \end{bmatrix} + \begin{bmatrix} u_{1,t} \\ u_{2,t} \end{bmatrix} = \beta_1 \begin{bmatrix} \alpha_1 \\ \alpha_2 \end{bmatrix} \begin{bmatrix} 1 & \beta_2/\beta_1 \end{bmatrix} \begin{bmatrix} y_{1,t-1} \\ y_{2,t-1} \end{bmatrix} + \begin{bmatrix} u_{1,t} \\ u_{2,t} \end{bmatrix} = \begin{bmatrix} \vartheta_1 \\ \vartheta_2 \end{bmatrix} \begin{bmatrix} 1 & -\beta \end{bmatrix} \begin{bmatrix} y_{1,t-1} \\ y_{2,t-1} \end{bmatrix} + \begin{bmatrix} u_{1,t} \\ u_{2,t} \end{bmatrix} = \begin{bmatrix} \vartheta_1 \\ \vartheta_2 \end{bmatrix} (y_{1,t-1} - \beta y_{2,t-1}) + \begin{bmatrix} u_{1,t} \\ u_{2,t} \end{bmatrix}$
where $\beta = -\beta_2/\beta_1$ is the cointegration coefficient and $\vartheta_1 = \beta_1\alpha_1$ and $\vartheta_2 = \beta_1\alpha_2$ are the speed-of-adjustment coefficients.
We observe that
$\Delta y_{1,t} - \beta\Delta y_{2,t} = \vartheta_1 y_{1,t-1} - \beta\vartheta_2 y_{1,t-1} - \beta\vartheta_1 y_{2,t-1} + \beta^2\vartheta_2 y_{2,t-1} + u_{1,t} - \beta u_{2,t}$  (3)
By rearranging Equation (3) we obtain an AR(1) model for $y_{1,t} - \beta y_{2,t}$:
$y_{1,t} - \beta y_{2,t} = \delta(y_{1,t-1} - \beta y_{2,t-1}) + u_{1,t} - \beta u_{2,t}$
where $\delta = 1 + \vartheta_1 - \beta\vartheta_2 = a_{11} + a_{22} - 1$. Since $y_{1,t}$ and $y_{2,t}$ are cointegrated, $y_{1,t} - \beta y_{2,t}$ is a stationary process, and so
$|a_{11} + a_{22} - 1| < 1$
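The conclusion of Lemma 3 can be checked by simulation. In the sketch below (illustrative coefficients, not from the paper), $A$ is chosen so that $A - I$ has rank 1 with cointegrating coefficient $\beta = 1$; an OLS regression of $z_t = y_{1,t} - \beta y_{2,t}$ on its own lag then recovers $\delta = a_{11} + a_{22} - 1$.

```python
import numpy as np

rng = np.random.default_rng(4)
T = 100_000

# Illustrative coefficients: A - I = [[-0.3, 0.3], [0.3, -0.3]] has rank 1,
# with cointegrating coefficient beta = 1 and delta = a11 + a22 - 1 = 0.4
A = np.array([[0.7, 0.3], [0.3, 0.7]])
beta = 1.0
delta = A[0, 0] + A[1, 1] - 1

y = np.zeros((T + 1, 2))
for t in range(T):
    y[t + 1] = A @ y[t] + rng.standard_normal(2)

# z_t = y1_t - beta*y2_t should follow a stationary AR(1) with coefficient delta
z = y[:, 0] - beta * y[:, 1]
slope = np.sum(z[1:] * z[:-1]) / np.sum(z[:-1] ** 2)  # OLS estimate of delta
print(slope, delta)  # slope is close to 0.4
```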
Lemma 4.
The process $\Delta y_t = [\Delta y_{1,t}, \Delta y_{2,t}]'$, $t = 1, \dots$, is jointly unpredictable if and only if
$A = I$
Proof.
(⇒) If the process $\Delta y_t = [\Delta y_{1,t}, \Delta y_{2,t}]'$, $t = 1, \dots$, is jointly unpredictable, then
$E(y_t \mid \sigma(y_{t-1}, \dots, y_0)) = y_{t-1}$
On the other hand, since
$y_t = Ay_{t-1} + u_t$
with $E(u_t) = 0$, $E(u_t u_t') = \Sigma$ for all $t$, and $E(u_t u_s') = 0$ for $s \neq t$, we have that
$E(y_t \mid \sigma(y_{t-1}, \dots, y_0)) = Ay_{t-1}$
Hence we have
$Ay_{t-1} = y_{t-1}$
and so
$A = I$
(⇐) If
$A = I$
then $y_t = y_{t-1} + u_t$ with $E(u_t) = 0$, $E(u_t u_t') = \Sigma$ for all $t$, and $E(u_t u_s') = 0$ for $s \neq t$, and hence
$E(y_t \mid \sigma(y_{t-1}, \dots, y_0)) = y_{t-1}$
Thus we can conclude that the process $\Delta y_t = [\Delta y_{1,t}, \Delta y_{2,t}]'$, $t = 1, \dots$, is jointly unpredictable.
Before concluding this subsection, we observe that Equation (2) can be written in lag-operator notation. The lag operator $L$ is defined by $Ly_t = y_{t-1}$. We have that
$(I - AL)y_t = u_t$
or
$\begin{bmatrix} 1 - a_{11}L & -a_{12}L \\ -a_{21}L & 1 - a_{22}L \end{bmatrix} \begin{bmatrix} y_{1,t} \\ y_{2,t} \end{bmatrix} = \begin{bmatrix} u_{1,t} \\ u_{2,t} \end{bmatrix}$
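The lag-operator identity can be verified mechanically: applying $I - AL$ to a simulated path, i.e., computing $y_t - Ay_{t-1}$, must return the shocks exactly. A small sketch with illustrative coefficients:

```python
import numpy as np

rng = np.random.default_rng(6)
T = 1_000
A = np.array([[0.7, 0.3], [0.3, 0.7]])  # illustrative coefficients

u = rng.standard_normal((T, 2))
y = np.zeros((T + 1, 2))
for t in range(T):
    y[t + 1] = A @ y[t] + u[t]

# Applying (I - A L) means computing y_t - A y_{t-1}, which must return u_t
resid = y[1:] - y[:-1] @ A.T
print(np.max(np.abs(resid - u)))  # 0 up to floating-point rounding
```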

#### 3.2. Proof of Theorem 1

Sufficiency. If
$d(I_{\Delta y_1}(t^+), H_{\Delta y_2}(t)) = \sigma_{\Delta y_1} \quad \text{and} \quad d(I_{\Delta y_2}(t^+), H_{\Delta y_1}(t)) = \sigma_{\Delta y_2}$
then, by Lemma 1, we have
$I_{\Delta y_1}(t^+) \perp H_{\Delta y_2}(t)$
and
$I_{\Delta y_2}(t^+) \perp H_{\Delta y_1}(t)$
Now we assume that $a_{12}$ and $a_{21}$ are not both equal to zero. We can have three cases.
Case (1): $a_{12} \neq 0$ and $a_{21} = 0$. This implies that
$\Delta y_{1,t} = (a_{11} - 1)y_{1,t-1} + a_{12}y_{2,t-1} + u_{1,t}$
and, since $a_{21} = 0$ together with $y_{2,t} \sim I(1)$ forces $a_{22} = 1$,
$\Delta y_{2,t} = u_{2,t}$
Thus
$\langle \Delta y_{1,t+1}, \Delta y_{2,t} \rangle = E(\Delta y_{1,t+1}\Delta y_{2,t}) = (a_{11} - 1)E(y_{1,t}u_{2,t}) + a_{12}E(y_{2,t}u_{2,t}) + E(u_{1,t+1}u_{2,t}) = (a_{11} - 1)E(y_{1,t}u_{2,t}) + a_{12}E\left(u_{2,t}\sum_{s=1}^{t} u_{2,s}\right) = (a_{11} - 1)E(y_{1,t}u_{2,t}) + a_{12}\sigma_2^2$
Now, we note that
$E(y_{1,t}u_{2,t}) = a_{11}^t E(y_{1,0}u_{2,t}) = 0$
Thus
$\langle \Delta y_{1,t+1}, \Delta y_{2,t} \rangle = a_{12}\sigma_2^2 \neq 0$
but this is absurd since
$I_{\Delta y_1}(t^+) \perp H_{\Delta y_2}(t)$
Case (2): $a_{12} = 0$ and $a_{21} \neq 0$. In this case, by a symmetric argument, we have
$\langle \Delta y_{2,t+1}, \Delta y_{1,t} \rangle = a_{21}\sigma_1^2 \neq 0$
Again this is absurd since
$I_{\Delta y_2}(t^+) \perp H_{\Delta y_1}(t)$
Case (3): $a_{12} \neq 0$ and $a_{21} \neq 0$. We note that
$\begin{bmatrix} \Delta y_{1,t} \\ \Delta y_{2,t} \end{bmatrix} = \begin{bmatrix} (1 - a_{22}L)\gamma(L) & a_{12}L\,\gamma(L) \\ a_{21}L\,\gamma(L) & (1 - a_{11}L)\gamma(L) \end{bmatrix} \begin{bmatrix} u_{1,t} \\ u_{2,t} \end{bmatrix}$
where
$\gamma(L) = \frac{1 - L}{(1 - a_{11}L)(1 - a_{22}L) - a_{12}a_{21}L^2}$
By Lemma 2, we have that $y_{1,t}$ and $y_{2,t}$ are cointegrated, and hence the matrix
$A - I = \begin{bmatrix} a_{11} - 1 & a_{12} \\ a_{21} & a_{22} - 1 \end{bmatrix}$
has rank 1. It follows that
$a_{12}a_{21} = (1 - a_{11})(1 - a_{22})$
Thus
$\gamma(L) = \frac{1 - L}{(1 - a_{11}L)(1 - a_{22}L) - (1 - a_{11})(1 - a_{22})L^2} = \frac{1 - L}{(1 - L)(1 + L) - (1 - L)(a_{11} + a_{22})L} = \frac{1}{1 + L - (a_{11} + a_{22})L} = \frac{1}{1 - (a_{11} + a_{22} - 1)L} = \frac{1}{1 - \delta L}$
where $\delta = a_{11} + a_{22} - 1$.
Since $y_{1,t}$ and $y_{2,t}$ are cointegrated, by Lemma 3 we have that $|\delta| < 1$ and hence
$\gamma(L) = 1 + \delta L + \delta^2 L^2 + \dots$
Now, we can have two cases.
Case (a): $\delta = 0$. In this case we have
$\Delta y_{1,t} = u_{1,t} - a_{22}u_{1,t-1} + a_{12}u_{2,t-1}$
and
$\Delta y_{2,t} = a_{21}u_{1,t-1} + u_{2,t} - a_{11}u_{2,t-1}$
Thus
$\langle \Delta y_{1,t+1}, \Delta y_{2,t} \rangle = a_{12}\sigma_2^2 \neq 0$
and
$\langle \Delta y_{2,t+1}, \Delta y_{1,t} \rangle = a_{21}\sigma_1^2 \neq 0$
but this is absurd since
$I_{\Delta y_1}(t^+) \perp H_{\Delta y_2}(t)$
and
$I_{\Delta y_2}(t^+) \perp H_{\Delta y_1}(t)$
Case (b): $\delta \neq 0$. In this case we have
$\Delta y_{1,t} = u_{1,t} + (a_{11} - 1)u_{1,t-1} + \delta(a_{11} - 1)u_{1,t-2} + \dots + a_{12}u_{2,t-1} + a_{12}\delta u_{2,t-2} + \dots = u_{1,t} + (a_{11} - 1)\sum_{i=0}^{\infty}\delta^i u_{1,t-1-i} + a_{12}\sum_{i=0}^{\infty}\delta^i u_{2,t-1-i}$
and
$\Delta y_{2,t} = u_{2,t} + (a_{22} - 1)u_{2,t-1} + \delta(a_{22} - 1)u_{2,t-2} + \dots + a_{21}u_{1,t-1} + a_{21}\delta u_{1,t-2} + \dots = u_{2,t} + (a_{22} - 1)\sum_{i=0}^{\infty}\delta^i u_{2,t-1-i} + a_{21}\sum_{i=0}^{\infty}\delta^i u_{1,t-1-i}$
Thus
$\langle \Delta y_{1,t+1}, \Delta y_{2,t} \rangle = (a_{11} - 1)\sigma_1^2\frac{\delta}{1 - \delta^2}a_{21} + \left[1 + (a_{22} - 1)\frac{\delta}{1 - \delta^2}\right]\sigma_2^2 a_{12}$
and
$\langle \Delta y_{2,t+1}, \Delta y_{1,t} \rangle = (a_{22} - 1)\sigma_2^2\frac{\delta}{1 - \delta^2}a_{12} + \left[1 + (a_{11} - 1)\frac{\delta}{1 - \delta^2}\right]\sigma_1^2 a_{21}$
Setting both inner products to zero gives the homogeneous system
$\begin{bmatrix} \left[1 + (a_{22} - 1)\frac{\delta}{1 - \delta^2}\right]\sigma_2^2 & (a_{11} - 1)\sigma_1^2\frac{\delta}{1 - \delta^2} \\ (a_{22} - 1)\sigma_2^2\frac{\delta}{1 - \delta^2} & \left[1 + (a_{11} - 1)\frac{\delta}{1 - \delta^2}\right]\sigma_1^2 \end{bmatrix} \begin{bmatrix} a_{12} \\ a_{21} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$
The determinant of the coefficient matrix is
$\sigma_1^2\sigma_2^2\left(1 - \frac{\delta}{1 + \delta}\right)$
Since $\sigma_1^2\sigma_2^2 > 0$ and $\frac{\delta}{1 + \delta} \neq 1$, we have that
$\sigma_1^2\sigma_2^2\left(1 - \frac{\delta}{1 + \delta}\right) \neq 0$
so the system admits only the trivial solution. Thus $a_{12} \neq 0$, $a_{21} \neq 0$ implies that $\langle \Delta y_{1,t+1}, \Delta y_{2,t} \rangle \neq 0$ or $\langle \Delta y_{2,t+1}, \Delta y_{1,t} \rangle \neq 0$, but this is absurd since
$I_{\Delta y_1}(t^+) \perp H_{\Delta y_2}(t)$
and
$I_{\Delta y_2}(t^+) \perp H_{\Delta y_1}(t)$
In all Cases (1)–(3) we obtain an absurd conclusion; thus we can state that
$a_{12} = 0, \quad a_{21} = 0$
Now, we prove that $a_{11} = a_{22} = 1$. We have that
$\Delta y_{i,t} = (a_{ii} - 1)y_{i,t-1} + u_{i,t}, \quad i = 1, 2$
Since the error term $u_t = [u_{1,t}, u_{2,t}]'$ is stationary, these equations must be balanced; that is, the orders of integration of $\Delta y_{i,t}$ and $(a_{ii} - 1)y_{i,t-1}$ must be the same. By the hypothesis that $y_{i,t} \sim I(1)$, it follows that $\Delta y_{i,t} \sim I(0)$ (i.e., stationary), whereas $(a_{ii} - 1)y_{i,t-1}$ is $I(1)$ unless $a_{ii} = 1$; hence $\Delta y_{i,t} = (a_{ii} - 1)y_{i,t-1} + u_{i,t}$, $i = 1, 2$, implies that $a_{11} = a_{22} = 1$. Thus $A = I$ and hence, by Lemma 4, it follows that the process $\{\Delta y_t;\ t = 1, \dots\}$ is jointly unpredictable.
Necessity. If the process $\{\Delta y_t;\ t = 1, \dots\}$ is jointly unpredictable, then by Lemma 4 it follows that $A = I$, and hence $\Delta y_{1,t} = u_{1,t}$ and $\Delta y_{2,t} = u_{2,t}$ for all $t$. This implies that $P(\Delta y_{1,t+h} \mid H_{\Delta y_2}(t)) = 0$ and $P(\Delta y_{2,t+h} \mid H_{\Delta y_1}(t)) = 0$ for all $h > 0$. Therefore we have that $\Delta y_{1,t+h} \perp H_{\Delta y_2}(t)$ and $\Delta y_{2,t+h} \perp H_{\Delta y_1}(t)$ for all $h > 0$. Thus, by Lemma 1, it follows that
$d(I_{\Delta y_1}(t^+), H_{\Delta y_2}(t)) = \sigma_{\Delta y_1} \quad \text{and} \quad d(I_{\Delta y_2}(t^+), H_{\Delta y_1}(t)) = \sigma_{\Delta y_2}$
Theorem 1 is proved.
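As a sanity check on the Case (b) computations, the closed-form expressions for $\langle \Delta y_{1,t+1}, \Delta y_{2,t} \rangle$ and $\langle \Delta y_{2,t+1}, \Delta y_{1,t} \rangle$ can be compared with Monte Carlo estimates from a simulated cointegrated VAR(1). The parameters below are illustrative choices satisfying the rank-1 restriction $a_{12}a_{21} = (1 - a_{11})(1 - a_{22})$.

```python
import numpy as np

# Illustrative parameters satisfying the rank-1 restriction
# a12 * a21 = (1 - a11)(1 - a22), so y1 and y2 are cointegrated
a11, a22, a12 = 0.8, 0.6, 0.4
a21 = (1 - a11) * (1 - a22) / a12      # = 0.2
s1, s2 = 1.0, 1.0                      # sigma_1^2 and sigma_2^2
delta = a11 + a22 - 1                  # = 0.4, with |delta| < 1
c = delta / (1 - delta ** 2)

# Closed-form inner products from Case (b)
cov12 = (a11 - 1) * s1 * c * a21 + (1 + (a22 - 1) * c) * s2 * a12
cov21 = (a22 - 1) * s2 * c * a12 + (1 + (a11 - 1) * c) * s1 * a21

# Monte Carlo estimates from a simulated path
rng = np.random.default_rng(5)
T = 200_000
A = np.array([[a11, a12], [a21, a22]])
u = rng.standard_normal((T, 2))
y = np.zeros((T + 1, 2))
for t in range(T):
    y[t + 1] = A @ y[t] + u[t]
dy = np.diff(y, axis=0)

mc12 = np.mean(dy[1:, 0] * dy[:-1, 1])  # <dy1_{t+1}, dy2_t>
mc21 = np.mean(dy[1:, 1] * dy[:-1, 0])  # <dy2_{t+1}, dy1_t>
print(cov12, mc12)  # the closed form and the simulation agree
print(cov21, mc21)
```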

## 4. Conclusions

In this paper we have considered the following geometric condition concerning the distance between information sets:
$d(I_{\Delta y_1}(t^+), H_{\Delta y_2}(t)) = \sigma_{\Delta y_1} \quad \text{and} \quad d(I_{\Delta y_2}(t^+), H_{\Delta y_1}(t)) = \sigma_{\Delta y_2}$
It says that the distances $d(I_{\Delta y_1}(t^+), H_{\Delta y_2}(t))$ and $d(I_{\Delta y_2}(t^+), H_{\Delta y_1}(t))$ each achieve their maximum value. Theorem 1 tells us that, under the hypothesis that the process $y_t$ follows a bivariate VAR(1) model, the condition Equation (4) represents a geometric characterization of the notion of joint unpredictability. If this condition holds, the processes $\Delta y_1$ and $\Delta y_2$ are jointly unpredictable, since the past of the bivariate process $y_t$ does not contain any valuable information about the future of the differenced series: the information in the past is too distant from the information in the future.
Even if the bivariate VAR(1) assumption is far from general, we think that this geometric characterization is useful in throwing light on the concept of joint unpredictability of a stochastic process.