
Entropy 2014, 16(6), 2944-2958; doi:10.3390/e16062944

Article
Information Geometric Complexity of a Trivariate Gaussian Statistical Model
Domenico Felice 1,2,*, Carlo Cafaro 3 and Stefano Mancini 1,2

1 School of Science and Technology, University of Camerino, I-62032 Camerino, Italy; E-Mail: stefano.mancini@unicam.it
2 INFN-Sezione di Perugia, Via A. Pascoli, I-06123 Perugia, Italy
3 Department of Mathematics, Clarkson University, Potsdam, NY 13699, USA; E-Mail: carlocafaro2000@yahoo.it
* Author to whom correspondence should be addressed; E-Mail: domenico.felice@unicam.it
Received: 1 April 2014; in revised form: 21 May 2014 / Accepted: 22 May 2014 /
Published: 26 May 2014

Abstract

We evaluate the information geometric complexity of entropic motion on low-dimensional Gaussian statistical manifolds in order to quantify how difficult it is to make macroscopic predictions about systems in the presence of limited information. Specifically, we observe that the complexity of such entropic inferences depends not only on the amount of available pieces of information but also on the manner in which such pieces are correlated. Finally, we uncover that, for certain correlational structures, the impossibility of reaching the most favorable configuration from an entropic inference viewpoint seems to lead to an information geometric analog of the well-known frustration effect that occurs in statistical physics.
Keywords:
probability theory; Riemannian geometry; complexity

1. Introduction

One of the main efforts in physics is modeling and predicting natural phenomena using relevant information about the system under consideration. Theoretical physics has had a general measure of the uncertainty associated with the behavior of a probabilistic process for more than half a century: the Shannon entropy [1]. Shannon's information theory was applied to dynamical systems and became successful in describing their unpredictability [2].

Along a similar avenue lies Entropic Dynamics (ED) [3], which makes use of inductive inference (Maximum Entropy methods [4]) and Information Geometry [5]. This is remarkable given that microscopic dynamics can be far removed from the phenomena of interest, such as in complex biological or ecological systems. The extension of ED to temporally-complex dynamical systems on curved statistical manifolds has led to relevant measures of chaoticity [6]. In particular, an information geometric approach to chaos (IGAC) has been pursued by studying chaos in informational geodesic flows describing physical, biological or chemical systems. It is the information geometric analogue of conventional geometrodynamical approaches [7], where the classical configuration space is replaced by a statistical manifold, with the additional possibility of considering chaotic dynamics arising from non-conformally flat metrics. Within this framework, it seems natural to consider as a complexity measure the (time-averaged) statistical volume explored by geodesic flows, namely an Information Geometry Complexity (IGC).

This quantity might help uncover connections between microscopic dynamics and experimentally observable macroscopic dynamics, which is a fundamental issue in physics [8]. An interesting manifestation of such a relationship appears in the study of the effects of microscopic external noise (noise imposed on the microscopic variables of the system) on the observed collective motion (macroscopic variables) of a globally coupled map [9]. These effects are quantified in terms of the complexity of the collective motion. Furthermore, it turns out that noise at the microscopic level reduces the complexity of the macroscopic motion, which in turn is characterized by the number of effective degrees of freedom of the system.

The investigation of the macroscopic behavior of complex systems in terms of the underlying statistical structure of their microscopic degrees of freedom also reveals effects due to the presence of microcorrelations [10]. In this article we first show which macro-states should be considered in a Gaussian statistical model in order to obtain a reduction in time of the Information Geometry Complexity. Then, dealing with correlated bivariate and trivariate Gaussian statistical models, the ratio between the IGC in the presence and in the absence of microcorrelations is explicitly computed, revealing an intriguing, though not yet deeply understood, connection with the phenomenon of geometric frustration [11].

The layout of the article is as follows. In Section 2 we introduce a general statistical model discussing its geometry and describing both its dynamics and information geometry complexity. In Section 3, Gaussian statistical models (up to a trivariate model) are considered. There, we compute the asymptotic temporal behaviors of their IGCs. Finally, in Section 4 we draw our conclusions by outlining our findings and proposing possible further investigations.

2. Statistical Models and Information Geometry Complexity

Given n real-valued random variables X1,…, Xn defined on the sample space Ω with joint probability density p: ℝn → ℝ satisfying the conditions

\[
p(x) \geq 0 \quad (\forall x \in \mathbb{R}^n) \qquad \text{and} \qquad \int_{\mathbb{R}^n} dx\, p(x) = 1,
\tag{1}
\]

let us consider a family P of such distributions and suppose that they can be parametrized using m real-valued variables (θ1,…, θm) so that

\[
P = \left\{ p_\theta = p(x|\theta) \,\middle|\, \theta = (\theta^1, \ldots, \theta^m) \in \Theta \right\},
\tag{2}
\]

where Θ ⊆ ℝm is the parameter space and the mapping θpθ is injective. In such a way, P is an m-dimensional statistical model on ℝn.

The mapping $\varphi : P \to \mathbb{R}^m$ defined by $\varphi(p_\theta) = \theta$ allows us to consider $\varphi = [\theta^i]$ as a coordinate system for P. Assuming parametrizations which are $C^\infty$, we can turn P into a $C^\infty$ differentiable manifold (thus, P is called a statistical manifold) [5].

The values x1,…, xn taken by the random variables define the micro-state of the system, while the values θ1,…, θm taken by parameters define the macro-state of the system.

Let P = { p θ | θ Θ } be an m-dimensional statistical model. Given a point θ, the Fisher information matrix of P in θ is the m × m matrix G(θ) = [gij], where the (i, j) entry is defined by

\[
g_{ij}(\theta) := \int_{\mathbb{R}^n} dx\; p(x|\theta)\, \partial_i \log p(x|\theta)\, \partial_j \log p(x|\theta),
\tag{3}
\]

with $\partial_i$ standing for $\partial/\partial\theta^i$. The matrix G(θ) is symmetric, positive semidefinite and determines a Riemannian metric on the parameter space Θ [5]. Hence, it is possible to define a Riemannian statistical manifold $\mathcal{M} := (\Theta, g)$, where $g = g_{ij}\, d\theta^i \otimes d\theta^j$ (i, j = 1,…, m) is the metric whose components $g_{ij}$ are given by Equation (3) (throughout the paper we use the Einstein sum convention).
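As a quick numerical illustration (our own sketch, not part of the original derivation), the definition (3) can be evaluated by direct quadrature. The example below assumes a univariate Gaussian model with coordinates θ = (μ, σ) and recovers the Fisher–Rao metric diag(1/σ², 2/σ²) that appears later in Section 3.

```python
import numpy as np

# Fisher matrix of Equation (3) for a univariate Gaussian, theta = (mu, sigma),
# computed by numerical quadrature of E[d_i log p * d_j log p].
def fisher_gaussian(mu, sigma, n=200001, span=12.0):
    x = np.linspace(mu - span * sigma, mu + span * sigma, n)
    p = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)
    d_mu = (x - mu) / sigma**2                       # d log p / d mu
    d_sigma = ((x - mu) ** 2 - sigma**2) / sigma**3  # d log p / d sigma
    dx = x[1] - x[0]
    grads = np.stack([d_mu, d_sigma])
    # Outer product of score components, weighted by p(x) and summed over x
    return (grads[:, None, :] * grads[None, :, :] * p).sum(axis=2) * dx

G = fisher_gaussian(0.0, 2.0)
# Expected: diag(1/sigma^2, 2/sigma^2) = diag(0.25, 0.5)
```

The off-diagonal entry vanishes, reflecting the orthogonality of the (μ, σ) coordinates in the Fisher–Rao metric.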

Given the Riemannian manifold $\mathcal{M} = (\Theta, g)$, it is well known that there exists only one linear connection ∇ (the Levi–Civita connection) on $\mathcal{M}$ that is compatible with the metric g and symmetric [12]. We remark that the manifold $\mathcal{M}$ has a single chart, Θ being an open subset of $\mathbb{R}^m$, and the Levi–Civita connection is uniquely defined by means of the Christoffel coefficients

\[
\Gamma^k_{ij} = \frac{1}{2}\, g^{kl} \left( \frac{\partial g_{lj}}{\partial \theta^i} + \frac{\partial g_{il}}{\partial \theta^j} - \frac{\partial g_{ij}}{\partial \theta^l} \right), \qquad (i, j, k = 1, \ldots, m)
\tag{4}
\]

where gkl is the (k, l) entry of the inverse of the Fisher matrix G(θ).
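Equation (4) lends itself to a direct numerical check (a sketch we add here, not taken from the paper): given any metric as a function of θ, the Christoffel coefficients can be obtained by finite differences. Applied to the Fisher–Rao metric of the univariate Gaussian, g = diag(1/σ², 2/σ²), it reproduces the closed-form coefficients Γ¹₁₂ = −1/σ, Γ²₁₁ = 1/(2σ), Γ²₂₂ = −1/σ used in Section 3.1.3.

```python
import numpy as np

def christoffel(metric, theta, h=1e-5):
    """Gamma^k_ij = (1/2) g^{kl} (d_i g_lj + d_j g_il - d_l g_ij), Equation (4),
    with the metric derivatives approximated by central finite differences."""
    m = len(theta)
    Ginv = np.linalg.inv(metric(theta))
    dG = np.zeros((m, m, m))  # dG[l, i, j] = d g_ij / d theta^l
    for l in range(m):
        e = np.zeros(m); e[l] = h
        dG[l] = (metric(theta + e) - metric(theta - e)) / (2 * h)
    Gamma = np.zeros((m, m, m))  # Gamma[k, i, j]
    for k in range(m):
        for i in range(m):
            for j in range(m):
                Gamma[k, i, j] = 0.5 * sum(
                    Ginv[k, l] * (dG[i][l, j] + dG[j][i, l] - dG[l][i, j])
                    for l in range(m))
    return Gamma

# Fisher-Rao metric of the univariate Gaussian, theta = (mu, sigma)
fisher = lambda th: np.diag([1.0 / th[1]**2, 2.0 / th[1]**2])

Gamma = christoffel(fisher, np.array([0.0, 2.0]))
# At sigma = 2: Gamma^1_12 = -0.5, Gamma^2_11 = 0.25, Gamma^2_22 = -0.5
```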

The idea of curvature is the fundamental tool to understand the geometry of the manifold $\mathcal{M} = (\Theta, g)$. It is the basic geometric invariant, and the intrinsic way to obtain it is by means of geodesics. It is well known that, given any point θ and any vector υ tangent to $\mathcal{M}$ at θ, there is a unique geodesic starting at θ with initial tangent vector υ. Indeed, within the considered coordinate system, the geodesics are solutions of the following nonlinear second-order coupled ordinary differential equations [12]

\[
\frac{d^2 \theta^k}{d\tau^2} + \Gamma^k_{ij}\, \frac{d\theta^i}{d\tau}\, \frac{d\theta^j}{d\tau} = 0,
\tag{5}
\]

with τ denoting the time.

The recipe to compute sectional curvatures at a point θ is the following: first, select a 2-dimensional subspace Π of the tangent space to $\mathcal{M}$ at θ; second, follow the geodesics through θ whose initial tangent vectors lie in Π and consider the 2-dimensional submanifold SΠ swept out by them, which inherits a Riemannian metric from $\mathcal{M}$; finally, compute the Gaussian curvature of SΠ at θ, which can be obtained from its Riemannian metric as stated in the Theorema Egregium [13]. The number K(Π) found in this manner is called the sectional curvature of $\mathcal{M}$ at θ associated with the plane Π. In terms of local coordinates, to compute the sectional curvature we need the curvature tensor,

\[
R^h_{ijk} = \frac{\partial \Gamma^h_{jk}}{\partial \theta^i} - \frac{\partial \Gamma^h_{ik}}{\partial \theta^j} + \Gamma^l_{jk}\,\Gamma^h_{il} - \Gamma^l_{ik}\,\Gamma^h_{jl}.
\tag{6}
\]

For any basis (ξ, η) of a 2-plane $\Pi \subset T_\theta \mathcal{M}$, the sectional curvature at θ is given by [12]

\[
K(\xi, \eta) = \frac{R(\xi, \eta, \eta, \xi)}{|\xi|^2 |\eta|^2 - \langle \xi, \eta \rangle^2},
\tag{7}
\]

where R is the Riemann curvature tensor, written in coordinates as $R = R_{ijkl}\, d\theta^i \otimes d\theta^j \otimes d\theta^k \otimes d\theta^l$ with $R_{ijkl} = g_{lh} R^h_{ijk}$, and ⟨·, ·⟩ is the inner product defined by the metric g.

The sectional curvature is directly related to the topology of the manifold; along this direction the Cartan–Hadamard theorem [13] is enlightening, stating that any complete, simply connected n-dimensional manifold with non-positive sectional curvature is diffeomorphic to $\mathbb{R}^n$.

We can regard the macro-variables θ on the statistical manifold $\mathcal{M} = (\Theta, g)$ as the accessible information and then derive the information dynamical Equation (5) from a standard principle of least action of Jacobi type [3]. The geodesic Equations (5) describe a reversible dynamics whose solution is the trajectory between an initial macro-state θinitial and a final macro-state θfinal; the trajectory can be equally traversed in both directions [10]. Actually, an equation relating instability with geometry exists, and it gives hope that some global information about the average degree of instability (chaos) of the dynamics is encoded in global properties of the statistical manifolds [7]. That this might happen is demonstrated by the special case of constant-curvature manifolds, for which the Jacobi–Levi-Civita equation simplifies to [7]

\[
\frac{d^2 J^i}{d\tau^2} + K\, J^i = 0,
\tag{8}
\]

where K is the constant sectional curvature of the manifold (see Equation (7)) and J is the geodesic deviation vector field. On a positively curved manifold, the norm of the separating vector J does not grow, whereas on a negatively curved manifold the norm of J grows exponentially in time; if the manifold is compact, so that its geodesics are sooner or later obliged to fold, this provides an example of chaotic geodesic motion [14].
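The dichotomy just described can be checked directly on Equation (8). The sketch below (an illustration we add here, not from the paper) integrates the geodesic deviation equation with a semi-implicit Euler scheme for K = ±1: for positive curvature the deviation oscillates and stays bounded, while for negative curvature it grows exponentially.

```python
def deviation_norm(K, tau, n=20000):
    """Integrate d^2 J/dtau^2 + K J = 0 with J(0) = 1, J'(0) = 1,
    using a semi-implicit (symplectic) Euler scheme; returns |J(tau)|."""
    dt = tau / n
    J, V = 1.0, 1.0
    for _ in range(n):
        V += -K * J * dt  # update velocity from the curvature force
        J += V * dt       # then update the deviation with the new velocity
    return abs(J)

# K = +1: |J| stays bounded (oscillatory); K = -1: |J| grows like exp(tau)
bounded = deviation_norm(+1.0, 5.0)
growing = deviation_norm(-1.0, 5.0)
```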

Taking these facts into consideration, we single out as a suitable indicator of dynamical (temporal) complexity the information geometric complexity, defined as the average dynamical statistical volume [15]

\[
\widetilde{vol}\left[ D_\Theta^{(\mathrm{geodesic})}(\tau) \right] := \frac{1}{\tau} \int_0^\tau d\tau'\; vol\left[ D_\Theta^{(\mathrm{geodesic})}(\tau') \right],
\tag{9}
\]

where

\[
vol\left[ D_\Theta^{(\mathrm{geodesic})}(\tau') \right] := \int_{D_\Theta^{(\mathrm{geodesic})}(\tau')} \sqrt{\det\left( G(\theta) \right)}\; d\theta,
\tag{10}
\]

with G(θ) the information matrix whose components are given by Equation (3). The integration space $D_\Theta^{(\mathrm{geodesic})}(\tau')$ is defined as follows

\[
D_\Theta^{(\mathrm{geodesic})}(\tau') := \left\{ \theta = (\theta^1, \ldots, \theta^m) : \theta^k(0) \leq \theta^k \leq \theta^k(\tau') \right\},
\tag{11}
\]

where θk = θk(s) with 0 ≤ s ≤ τ′ such that θk(s) satisfies (5). The quantity $vol[D_\Theta^{(\mathrm{geodesic})}(\tau')]$ is the volume of the effective parameter space explored by the system at time τ′. The temporal average has been introduced in order to average out the possibly very complex fine details of the entropic dynamical description of the system's complexity.
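It is worth noting how the time average in Equation (9) converts an exponentially shrinking explored volume into a power-law decay: if vol(τ′) = C exp(−kτ′), then the IGC behaves as (C/k)/τ for large τ, which is the mechanism behind the 1/τ asymptotics found in Section 3. A small numerical check (our own sketch, with arbitrary constants C and k) follows:

```python
import numpy as np

def igc(vol, tau, n=200000):
    """Time-averaged statistical volume of Equation (9):
    (1/tau) * integral_0^tau vol(t) dt, via the trapezoidal rule."""
    t = np.linspace(0.0, tau, n)
    v = vol(t)
    integral = (v[0] / 2 + v[1:-1].sum() + v[-1] / 2) * (t[1] - t[0])
    return integral / tau

# An exponentially decaying explored volume averages to a 1/tau power law.
C, k, tau = 4.0, 2.0, 50.0
igc_value = igc(lambda t: C * np.exp(-k * t), tau)
# For large tau, igc_value ~ (C/k)/tau
```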

Relevant properties of the quantity (10), concerning the complexity of geodesic paths on curved statistical manifolds, are discussed in [16] in comparison with the Jacobi vector field.

3. The Gaussian Statistical Model

In the following we devote our attention to a Gaussian statistical model P whose elements are multivariate normal joint distributions for n real-valued variables X1,…, Xn, given by

\[
p(x|\theta) = \frac{1}{\sqrt{(2\pi)^n \det C}}\, \exp\left[ -\frac{1}{2}\, (x-\mu)^t\, C^{-1}\, (x-\mu) \right],
\tag{12}
\]

where μ = (E(X1),…, E(Xn)) is the n-dimensional mean vector and C denotes the n × n covariance matrix with entries $c_{ij} = E(X_i X_j) - E(X_i)\,E(X_j)$, i, j = 1,…, n. Since μ is an n-dimensional real vector and C is an n × n symmetric matrix, the parameters involved in this model number at most $n + \frac{n(n+1)}{2}$. Moreover, C is a symmetric, positive definite matrix; hence the parameter space is given by

\[
\Theta := \left\{ (\mu, C) \,\middle|\, \mu \in \mathbb{R}^n,\; C \in \mathbb{R}^{n \times n},\; C > 0 \right\}.
\tag{13}
\]

Hereafter we consider the statistical model given by Equation (12) when the covariance matrix C has only the variances $\sigma_i^2 = E(X_i^2) - (E(X_i))^2$ as parameters. In fact, we assume that the off-diagonal entry (i, j) of the covariance matrix C equals $\rho\, \sigma_i \sigma_j$, with ρ ∈ ℝ quantifying the degree of correlation.

We may further notice that the function $f_{ij}(x) := \partial_i \log p(x|\theta)\, \partial_j \log p(x|\theta)$, when p(x|θ) is given by Equation (12), is a polynomial in the variables xi (i = 1,…, n) whose degree is not greater than four. Indeed, we have that

\[
\partial_i \log p(x|\theta) = \frac{1}{p(x|\theta)}\, \partial_i p(x|\theta) = \partial_i \log \frac{1}{\sqrt{(2\pi)^n \det C}} + \partial_i \left[ -\frac{1}{2}\, (x-\mu)^t\, C^{-1}\, (x-\mu) \right],
\tag{14}
\]

and, therefore, the differentiation (which is with respect to the parameters θ) does not raise the degree in the variables xi. With this in mind, in order to compute the integral in (3), we can use the following formula [17]

\[
\frac{1}{\sqrt{(2\pi)^n \det C}} \int dx\; f_{ij}(x)\, \exp\left[ -\frac{1}{2}\, (x-\mu)^t\, C^{-1}\, (x-\mu) \right] = \exp\left[ \frac{1}{2} \sum_{h,k=1}^n c_{hk}\, \frac{\partial}{\partial x_h} \frac{\partial}{\partial x_k} \right] f_{ij} \,\bigg|_{x=\mu},
\tag{15}
\]

where the exponential denotes the power series over its argument (the differential operator).
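As a sanity check of Equation (15) (our own illustration, for n = 1 and f(x) = x⁴, where the operator series truncates after the fourth derivative), one can compare the operator expansion with a Monte Carlo estimate of the Gaussian expectation; the values of μ and c below are arbitrary:

```python
import numpy as np

# Equation (15) for n = 1: E[f(X)] with X ~ N(mu, c) equals
# exp[(c/2) d^2/dx^2] f evaluated at x = mu.  For f(x) = x^4 the series
# truncates: (1 + (c/2) d^2 + (c^2/8) d^4) x^4 = x^4 + 6 c x^2 + 3 c^2.
mu, c = 1.3, 0.7
operator_value = mu**4 + 6.0 * c * mu**2 + 3.0 * c**2

rng = np.random.default_rng(0)
samples = rng.normal(mu, np.sqrt(c), 2_000_000)
monte_carlo = np.mean(samples**4)  # should agree with operator_value
```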

3.1. The Monovariate Gaussian Statistical Model

We now start to apply the concepts of the previous section to the Gaussian statistical model of Equation (12) for n = 1. In this case, the dimension of the statistical Riemannian manifold $\mathcal{M} = (\Theta, g)$ is at most two. Indeed, to describe elements of the statistical model P given by Equation (12), we basically need the mean μ = E(X) and the variance σ² = E(X − μ)². We deal separately with the cases in which the monovariate model has only μ as macro-variable (Case 1), σ as the unique macro-variable (Case 2), and finally both μ and σ as macro-variables (Case 3).

3.1.1. Case 1

Consider the monovariate model with only μ as macro-variable, obtained by setting σ = 1. In this case the manifold is trivially the flat real line, since μ ∈ (−∞, +∞). Indeed, the integral in (3) is equal to 1 when the distribution p(x|θ) reads $p(x|\mu) = \frac{1}{\sqrt{2\pi}} \exp\left[ -\frac{1}{2}(x-\mu)^2 \right]$; so the metric is $g = d\mu^2$. Furthermore, from Equations (4) and (5) the information dynamics is described by the geodesic μ(τ) = A1τ + A2, where A1, A2 ∈ ℝ. Hence, the volume of Equation (10) results in $vol[D_\Theta^{(\mathrm{geodesic})}(\tau)] = \int d\mu = A_1 \tau + A_2$; since this quantity must be positive, we assume A1, A2 > 0. Finally, the asymptotic behavior of the IGC (9) is

\[
\widetilde{vol}\left[ D_\Theta^{(\mathrm{geodesic})}(\tau) \right] \approx \left( \frac{A_1}{2} \right) \tau.
\tag{16}
\]

This shows that the complexity increases linearly in time, meaning that acquiring information about μ and updating it is not enough to increase our knowledge about the micro-state of the system.

3.1.2. Case 2

Consider now the monovariate Gaussian statistical model of Equation (12) when μ = E(X) = 0 and the only macro-variable is σ. In this case the probability distribution function reads $p(x|\sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[ -\frac{x^2}{2\sigma^2} \right]$, while the Fisher–Rao metric becomes $g = \frac{2}{\sigma^2}\, d\sigma^2$. Noting that in this case the manifold is flat as well, we derive the information dynamics by means of Equations (4) and (5) and obtain the geodesic σ(τ) = A1 exp[A2τ]. The volume in Equation (10) then results in

\[
vol\left[ D_\Theta^{(\mathrm{geodesic})}(\tau) \right] = \int \frac{\sqrt{2}}{\sigma}\, d\sigma = \sqrt{2}\, \log\left[ A_1 \exp[A_2 \tau] \right].
\tag{17}
\]

Again, to have positive volume we have to assume A1, A2 > 0. Finally, the (asymptotic) IGC (9) becomes

\[
\widetilde{vol}\left[ D_\Theta^{(\mathrm{geodesic})}(\tau) \right] \approx \left( \frac{\sqrt{2}\, A_2}{2} \right) \tau.
\tag{18}
\]

This shows that also in this case the complexity increases linearly in time, meaning that acquiring information about σ and updating it is not enough to increase our knowledge about the micro-state of the system.
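For the one-dimensional metric g = (2/σ²)dσ² stated above, Equation (4) gives the single Christoffel coefficient −1/σ, so the geodesic equation reduces to σ″ − (σ′)²/σ = 0. The finite-difference check below (our own addition, with arbitrary constants A1, A2) confirms that σ(τ) = A1 exp[A2τ] solves it:

```python
import math

# Check that sigma(tau) = A1 * exp(A2 * tau) satisfies the geodesic equation
# sigma'' - (sigma')^2 / sigma = 0 of the metric g = (2/sigma^2) dsigma^2.
A1, A2 = 0.5, 1.7
sigma = lambda t: A1 * math.exp(A2 * t)

def residual(t, h=1e-5):
    """Finite-difference residual of the geodesic equation at time t."""
    d1 = (sigma(t + h) - sigma(t - h)) / (2 * h)
    d2 = (sigma(t + h) - 2 * sigma(t) + sigma(t - h)) / h**2
    return d2 - d1**2 / sigma(t)

worst = max(abs(residual(t)) for t in (0.0, 0.3, 1.0))
# worst is limited only by finite-difference noise
```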

3.1.3. Case 3

The take-home message of the previous cases is that we have to account for both the mean μ and the variance σ as macro-variables in order to look for a possible non-increasing complexity. Hence, consider the probability distribution function given by

\[
p(x|\mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, \exp\left[ -\frac{(x-\mu)^2}{2\sigma^2} \right].
\tag{19}
\]

The dimension of the Riemannian manifold $\mathcal{M} = (\Theta, g)$ is two, where the parameter space is Θ = {(μ, σ) | μ ∈ (−∞, +∞), σ > 0} and the Fisher–Rao metric reads $g = \frac{1}{\sigma^2}\, d\mu^2 + \frac{2}{\sigma^2}\, d\sigma^2$. Here the sectional curvature given by Equation (7) is a negative function and, despite the fact that it is not constant, we expect a decreasing behavior in time of the IGC. Thanks to Equation (4), we find that the only non-vanishing Christoffel coefficients are $\Gamma^1_{12} = -\frac{1}{\sigma}$, $\Gamma^2_{11} = \frac{1}{2\sigma}$ and $\Gamma^2_{22} = -\frac{1}{\sigma}$. Substituting them into Equation (5) we derive the following geodesic equations

\[
\frac{d^2\mu}{d\tau^2} - \frac{2}{\sigma}\, \frac{d\mu}{d\tau}\, \frac{d\sigma}{d\tau} = 0, \qquad
\frac{d^2\sigma}{d\tau^2} + \frac{1}{2\sigma} \left( \frac{d\mu}{d\tau} \right)^2 - \frac{1}{\sigma} \left( \frac{d\sigma}{d\tau} \right)^2 = 0.
\tag{20}
\]

The integration of the above coupled differential equations is non-trivial. We follow the method described in [10] and arrive at

\[
\sigma(\tau) = \frac{2\sigma_0 \exp\left[ \sqrt{2}\, \sigma_0 A_1 \tau \right]}{1 + \exp\left[ 2\sqrt{2}\, \sigma_0 A_1 \tau \right]}, \qquad
\mu(\tau) = \frac{\sqrt{2}\, \sigma_0}{1 + \exp\left[ 2\sqrt{2}\, \sigma_0 A_1 \tau \right]},
\tag{21}
\]

where σ0 and A1 are real constants. Then, using (21), the volume of Equation (10) results in

\[
vol\left[ D_\Theta^{(\mathrm{geodesic})}(\tau) \right] = \int \frac{\sqrt{2}}{\sigma^2}\, d\sigma\, d\mu = 2\sqrt{2}\; \frac{A_1}{|A_1|}\, \exp\left[ -\sqrt{2}\, \sigma_0 |A_1|\, \tau \right].
\tag{22}
\]

Since the last quantity must be positive, we assume A1 > 0. Finally, employing the above expression into Equation (9) we arrive at

\[
\widetilde{vol}\left[ D_\Theta^{(\mathrm{geodesic})}(\tau) \right] \approx \left( \frac{2}{\sigma_0 A_1} \right) \frac{1}{\tau}.
\tag{23}
\]

We can now see a reduction in time of the complexity, meaning that acquiring information about both μ and σ and updating them allows us to increase our knowledge about the micro-state of the system.
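The geodesic flow of this two-dimensional model has a convenient invariant: with the Christoffel coefficients quoted above, the μ-equation reads μ̈ = (2/σ)μ̇σ̇, which implies that σ⁻²dμ/dτ is conserved along geodesics. The sketch below (our own numerical aside, with arbitrary initial data) integrates the flow with an explicit Euler scheme and monitors this first integral:

```python
def geodesic_step(mu, sig, dmu, dsig, dt):
    """One Euler step of the geodesics on the (mu, sigma) half-plane
    with Fisher-Rao metric g = diag(1/sigma^2, 2/sigma^2)."""
    ddmu = (2.0 / sig) * dmu * dsig                                # from Gamma^1_12 = -1/sigma
    ddsig = (1.0 / sig) * dsig**2 - (1.0 / (2.0 * sig)) * dmu**2  # Gamma^2_22 = -1/sigma, Gamma^2_11 = 1/(2 sigma)
    return mu + dmu * dt, sig + dsig * dt, dmu + ddmu * dt, dsig + ddsig * dt

# mu-equation implies d/dtau (sigma^{-2} dmu/dtau) = 0: a conserved first integral.
mu, sig, dmu, dsig = 0.0, 1.0, 0.3, -0.1
c0 = dmu / sig**2
for _ in range(20000):
    mu, sig, dmu, dsig = geodesic_step(mu, sig, dmu, dsig, 1e-4)
# sigma stays positive and c0 is conserved up to integration error
```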

Hence, comparing Equations (16), (18) and (23), we conclude that entropic inferences on a Gaussian distributed micro-variable are carried out in a more efficient manner when both its mean and its variance are available in the form of information constraints. Macroscopic predictions when only one of these pieces of information is available are more complex.

3.2. Bivariate Gaussian Statistical Model

Consider now the Gaussian statistical model P of Equation (12) when n = 2. In this case the dimension of the Riemannian manifold $\mathcal{M} = (\Theta, g)$ is at most four. From the analysis of the monovariate Gaussian model in Section 3.1 we have understood that both mean and variance should be considered. Hence the minimal assumption is to consider E(X1) = E(X2) = μ and E(X1 − μ)² = E(X2 − μ)² = σ². Furthermore, in this case we also have to take into account the possible presence of (micro-)correlations, which appear at the level of macro-states as off-diagonal terms in the covariance matrix. In short, this implies considering the following probability distribution function

\[
p(x_1, x_2 | \mu, \sigma) = \frac{\exp\left[ -\frac{1}{2\sigma^2(1-\rho^2)} \left( (x_1-\mu)^2 - 2\rho\,(x_1-\mu)(x_2-\mu) + (x_2-\mu)^2 \right) \right]}{2\pi \sigma^2 \sqrt{1-\rho^2}},
\tag{24}
\]

where ρ ∈ (−1,1).

Thanks to Equation (15) we compute the Fisher information matrix G and find $g = g_{11}\, d\mu^2 + g_{22}\, d\sigma^2$ with

\[
g_{11} = \frac{2}{\sigma^2 (\rho+1)}, \qquad g_{22} = \frac{4}{\sigma^2}.
\tag{25}
\]

The only non-trivial Christoffel coefficients (4) are $\Gamma^1_{12} = -\frac{1}{\sigma}$, $\Gamma^2_{11} = \frac{1}{2\sigma(\rho+1)}$ and $\Gamma^2_{22} = -\frac{1}{\sigma}$. In this case as well, the sectional curvature (Equation (7)) of the manifold is a negative function, and so we may expect a decreasing asymptotic behavior for the IGC. From Equation (5) it follows that the geodesic equations are

\[
\frac{d^2\mu}{d\tau^2} - \frac{2}{\sigma}\, \frac{d\mu}{d\tau}\, \frac{d\sigma}{d\tau} = 0, \qquad
\frac{d^2\sigma}{d\tau^2} + \frac{1}{2\sigma(\rho+1)} \left( \frac{d\mu}{d\tau} \right)^2 - \frac{1}{\sigma} \left( \frac{d\sigma}{d\tau} \right)^2 = 0,
\tag{26}
\]

whose solutions are,

\[
\sigma(\tau) = \frac{2\sigma_0 \exp\left[ \sigma_0 \mathcal{A}(\rho)\, \tau \right]}{1 + \exp\left[ 2\sigma_0 \mathcal{A}(\rho)\, \tau \right]}, \qquad
\mu(\tau) = \frac{2\sigma_0 A_1}{\mathcal{A}(\rho)}\, \frac{1}{1 + \exp\left[ 2\sigma_0 \mathcal{A}(\rho)\, \tau \right]}, \qquad
\mathcal{A}(\rho) = \sqrt{\frac{2 A_1^2}{1+\rho}},
\tag{27}
\]

Using (27) in Equation (10) gives the volume,

\[
vol\left[ D_\Theta^{(\mathrm{geodesic})}(\tau) \right] = \int \frac{2\sqrt{2}}{\sqrt{1+\rho}}\, \frac{1}{\sigma^2}\, d\sigma\, d\mu = 4\, \frac{A_1}{|A_1|}\, \exp\left[ -\sigma_0 |A_1| \sqrt{\frac{2}{1+\rho}}\; \tau \right].
\tag{28}
\]

To have it positive we have to assume A1 > 0. Finally, employing (28) in (9) leads to the IGC,

\[
\widetilde{vol}\left[ D_\Theta^{(\mathrm{geodesic})}(\tau) \right] \approx \left( \frac{4}{\sqrt{2}\, \sigma_0 A_1} \right) \frac{\sqrt{1+\rho}}{\tau},
\tag{29}
\]

with ρ ∈ (−1, 1). We may compare the asymptotic expressions of the IGCs in the presence and in the absence of correlations, obtaining

\[
R^{\mathrm{strong}}_{\mathrm{bivariate}}(\rho) := \frac{\widetilde{vol}\left[ D_\Theta^{(\mathrm{geodesic})}(\tau) \right]}{\widetilde{vol}\left[ D_\Theta^{(\mathrm{geodesic})}(\tau) \right]_{\rho=0}} = \sqrt{1+\rho},
\tag{30}
\]

where "strong" stands for the fully connected lattice underlying the micro-variables. The ratio $R^{\mathrm{strong}}_{\mathrm{bivariate}}(\rho)$ is a monotonically increasing function of ρ.

While the temporal behavior of the IGC (29) is similar to that of the IGC in (23), here correlations play a fundamental role. From Equation (30), we conclude that entropic inferences on two Gaussian distributed micro-variables on a fully connected lattice are carried out in a more efficient manner when the two micro-variables are negatively correlated. Instead, when such micro-variables are positively correlated, macroscopic predictions become more complex than in the absence of correlations.

Intuitively, this is due to the fact that for anticorrelated variables an increase in one variable implies a decrease in the other (different directional change): the variables become more distant, and thus more distinguishable in the Fisher–Rao information metric sense. Conversely, for positively correlated variables, an increase or decrease in one variable always predicts the same directional change for the second variable: the variables do not become more distant, and thus no more distinguishable in the Fisher–Rao information metric sense. This may lead us to guess that in the presence of anticorrelations, motion on curved statistical manifolds via Maximum Entropy updating methods becomes less complex.

3.3. Trivariate Gaussian Statistical Model

In this section we consider the Gaussian statistical model P of Equation (12) when n = 3. In this case as well, in order to understand the asymptotic behavior of the IGC in the presence of correlations between the micro-states, we make the minimal assumption that, given the random vector X = (X1, X2, X3) distributed according to a trivariate Gaussian, E(X1) = E(X2) = E(X3) = μ and E(X1 − μ)² = E(X2 − μ)² = E(X3 − μ)² = σ². Therefore, the parameter space of P is given by Θ = {(μ, σ) | μ ∈ ℝ, σ > 0}.

The manifold $\mathcal{M} = (\Theta, g)$ changes its metric structure depending on the number of correlations between micro-variables, namely one, two, or three. The covariance matrices corresponding to these cases read, modulo congruence via a permutation matrix [17],

\[
C_1 = \sigma^2 \begin{pmatrix} 1 & \rho & 0 \\ \rho & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad
C_2 = \sigma^2 \begin{pmatrix} 1 & \rho & \rho \\ \rho & 1 & 0 \\ \rho & 0 & 1 \end{pmatrix}, \qquad
C_3 = \sigma^2 \begin{pmatrix} 1 & \rho & \rho \\ \rho & 1 & \rho \\ \rho & \rho & 1 \end{pmatrix}.
\tag{31}
\]

3.3.1. Case 1

First, we consider the trivariate Gaussian statistical model of Equation (12) when C ≡ C1. Proceeding as in Section 3.2 we have $g = g_{11}\, d\mu^2 + g_{22}\, d\sigma^2$, where $g_{11} = \frac{3+\rho}{(1+\rho)\sigma^2}$ and $g_{22} = \frac{6}{\sigma^2}$. In this case too we find that the sectional curvature of Equation (7) is a negative function. Hence, as stated in Section 2, we may expect a decreasing (in time) behavior of the information geometry complexity. Furthermore, we obtain the geodesics

\[
\sigma(\tau) = \frac{2\sigma_0 \exp[\sigma_0 \mathcal{A}(\rho)\, \tau]}{1 + \exp[2\sigma_0 \mathcal{A}(\rho)\, \tau]}, \qquad
\mu(\tau) = \frac{2\sigma_0 A_1}{\mathcal{A}(\rho)}\, \frac{1}{1 + \exp[2\sigma_0 \mathcal{A}(\rho)\, \tau]},
\tag{32}
\]

where $\mathcal{A}(\rho) = \sqrt{\frac{A_1^2 (3+\rho)}{6(1+\rho)}}$ and A1 ∈ ℝ. We remark that $\mathcal{A}(\rho) > 0$ for all ρ ∈ (−1, 1). Then, the volume (10) becomes

\[
vol\left[ D_\Theta^{(\mathrm{geodesic})}(\tau) \right] = \int \sqrt{\frac{6(3+\rho)}{1+\rho}}\, \frac{1}{\sigma^2}\, d\sigma\, d\mu = 6\, \frac{A_1}{|A_1|}\, \exp\left[ -\sigma_0 \mathcal{A}(\rho)\, \tau \right],
\tag{33}
\]

requiring A1 > 0 for its positivity. Finally, using (33) in (9) we arrive at the asymptotic behavior of the IGC

\[
\widetilde{vol}\left[ D_\Theta^{(\mathrm{geodesic})}(\tau) \right] \approx \left( \frac{6\sqrt{6}}{\sigma_0 A_1} \right) \sqrt{\frac{1+\rho}{3+\rho}}\; \frac{1}{\tau}.
\tag{34}
\]

Comparing (34) in the presence and in the absence of correlations yields

\[
R^{\mathrm{weak}}_{\mathrm{trivariate}}(\rho) := \frac{\widetilde{vol}\left[ D_\Theta^{(\mathrm{geodesic})}(\tau) \right]}{\widetilde{vol}\left[ D_\Theta^{(\mathrm{geodesic})}(\tau) \right]_{\rho=0}} = \sqrt{\frac{3(1+\rho)}{3+\rho}},
\tag{35}
\]

where "weak" stands for the low degree of connection of the lattice underlying the micro-variables. Notice that $R^{\mathrm{weak}}_{\mathrm{trivariate}}(\rho)$ is a monotonically increasing function of the argument ρ ∈ (−1, 1).

3.3.2. Case 2

When the trivariate Gaussian statistical model of Equation (12) has C ≡ C2, the condition C > 0 constrains the correlation coefficient to $\rho \in \left( -\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2} \right)$. Proceeding again as in Section 3.2 we have $g = g_{11}\, d\mu^2 + g_{22}\, d\sigma^2$, where $g_{11} = \frac{3-4\rho}{(1-2\rho^2)\sigma^2}$ and $g_{22} = \frac{6}{\sigma^2}$. The sectional curvature of Equation (7) is a negative function as well, and so we may apply the arguments of Section 2, expecting the complexity to decrease in time. Furthermore, we obtain the geodesics

\[
\sigma(\tau) = \frac{2\sigma_0 \exp[\sigma_0 \mathcal{A}(\rho)\, \tau]}{1 + \exp[2\sigma_0 \mathcal{A}(\rho)\, \tau]}, \qquad
\mu(\tau) = \frac{2\sigma_0 A_1}{\mathcal{A}(\rho)}\, \frac{1}{1 + \exp[2\sigma_0 \mathcal{A}(\rho)\, \tau]},
\tag{36}
\]

where $\mathcal{A}(\rho) = \sqrt{\frac{A_1^2 (3-4\rho)}{6(1-2\rho^2)}}$ and A1 ∈ ℝ. We remark that $\mathcal{A}(\rho) > 0$ for all $\rho \in \left( -\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2} \right)$. Then, the volume (10) becomes

\[
vol\left[ D_\Theta^{(\mathrm{geodesic})}(\tau) \right] = \int \sqrt{\frac{6(3-4\rho)}{1-2\rho^2}}\, \frac{1}{\sigma^2}\, d\sigma\, d\mu = 6\, \frac{A_1}{|A_1|}\, \exp\left[ -\sigma_0 \mathcal{A}(\rho)\, \tau \right].
\tag{37}
\]

We have to set A1 > 0 for the positivity of the volume (37), and using it in (9) we arrive at the asymptotic behavior of the IGC

\[
\widetilde{vol}\left[ D_\Theta^{(\mathrm{geodesic})}(\tau) \right] \approx \left( \frac{6\sqrt{6}}{\sigma_0 A_1} \right) \sqrt{\frac{1-2\rho^2}{3-4\rho}}\; \frac{1}{\tau}.
\tag{38}
\]

Then, comparing (38) in the presence and in the absence of correlations yields

\[
R^{\mathrm{mildly\ weak}}_{\mathrm{trivariate}}(\rho) := \frac{\widetilde{vol}\left[ D_\Theta^{(\mathrm{geodesic})}(\tau) \right]}{\widetilde{vol}\left[ D_\Theta^{(\mathrm{geodesic})}(\tau) \right]_{\rho=0}} = \sqrt{\frac{3(1-2\rho^2)}{3-4\rho}},
\tag{39}
\]

where "mildly weak" stands for a lattice (underlying the micro-variables) that is neither fully connected nor minimally connected.

This is a function of the argument $\rho \in \left( -\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2} \right)$ that attains its maximum $\sqrt{\frac{3}{2}}$ at $\rho = \frac{1}{2}$, while at the extrema of the interval $\left( -\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2} \right)$ it tends to zero.

3.3.3. Case 3

Last, we consider the trivariate Gaussian statistical model of Equation (12) when C ≡ C3. In this case, the condition C > 0 requires the correlation coefficient to be $\rho \in \left( -\frac{1}{2}, 1 \right)$. Proceeding again as in Section 3.2 we have $g = g_{11}\, d\mu^2 + g_{22}\, d\sigma^2$, where $g_{11} = \frac{3}{(1+2\rho)\sigma^2}$ and $g_{22} = \frac{6}{\sigma^2}$. We find that the sectional curvature of Equation (7) is a negative function; hence, we may expect a decreasing (in time) behavior of the complexity. The geodesics follow as

\[
\sigma(\tau) = \frac{2\sigma_0 \exp[\sigma_0 \mathcal{A}(\rho)\, \tau]}{1 + \exp[2\sigma_0 \mathcal{A}(\rho)\, \tau]}, \qquad
\mu(\tau) = \frac{2\sigma_0 A_1}{\mathcal{A}(\rho)}\, \frac{1}{1 + \exp[2\sigma_0 \mathcal{A}(\rho)\, \tau]},
\tag{40}
\]

where $\mathcal{A}(\rho) = \sqrt{\frac{A_1^2}{2(1+2\rho)}}$ and A1 ∈ ℝ. We note that $\mathcal{A}(\rho) > 0$ for all $\rho \in \left( -\frac{1}{2}, 1 \right)$. Using (40), we compute

\[
vol\left[ D_\Theta^{(\mathrm{geodesic})}(\tau) \right] = \int 3\sqrt{\frac{2}{1+2\rho}}\, \frac{1}{\sigma^2}\, d\sigma\, d\mu = 6\sqrt{2}\, \frac{A_1}{|A_1|}\, \exp\left[ -\sigma_0 \mathcal{A}(\rho)\, \tau \right].
\tag{41}
\]

Also in this case we need to assume A1 > 0 to have a positive volume. Finally, substituting Equation (41) into Equation (9), the asymptotic behavior of the IGC is

\[
\widetilde{vol}\left[ D_\Theta^{(\mathrm{geodesic})}(\tau) \right] \approx \left( \frac{12}{\sigma_0 A_1} \right) \sqrt{1+2\rho}\; \frac{1}{\tau}.
\tag{42}
\]

The comparison of (42) in the presence and in the absence of correlations yields

\[
R^{\mathrm{strong}}_{\mathrm{trivariate}}(\rho) := \frac{\widetilde{vol}\left[ D_\Theta^{(\mathrm{geodesic})}(\tau) \right]}{\widetilde{vol}\left[ D_\Theta^{(\mathrm{geodesic})}(\tau) \right]_{\rho=0}} = \sqrt{1+2\rho},
\tag{43}
\]

where "strong" stands for a fully connected lattice underlying the (three) micro-variables. We remark that the latter ratio is a monotonically increasing function of the argument $\rho \in \left( -\frac{1}{2}, 1 \right)$.

The behaviors of R(ρ) of Equations (30), (35), (39) and (43) are reported in Figure 1.
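The four ratios can also be tabulated directly; the sketch below (our own addition, mirroring what Figure 1 displays) encodes Equations (30), (35), (39) and (43) and checks both the peak of the mildly weak ratio at ρ = 1/2 and its coincidence there with the bivariate strong ratio.

```python
import math

# Complexity ratios of Equations (30), (35), (39) and (43), as plotted in Figure 1.
def R_bivariate_strong(rho):        # rho in (-1, 1)
    return math.sqrt(1.0 + rho)

def R_trivariate_weak(rho):         # rho in (-1, 1)
    return math.sqrt(3.0 * (1.0 + rho) / (3.0 + rho))

def R_trivariate_mildly_weak(rho):  # rho in (-sqrt(2)/2, sqrt(2)/2)
    return math.sqrt(3.0 * (1.0 - 2.0 * rho**2) / (3.0 - 4.0 * rho))

def R_trivariate_strong(rho):       # rho in (-1/2, 1)
    return math.sqrt(1.0 + 2.0 * rho)
```

The mildly weak ratio attains its maximum √(3/2) at ρ = 1/2, where it equals the bivariate strong ratio, while the three other ratios increase monotonically on their domains.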

The non-monotonic behavior of the ratio $R^{\mathrm{mildly\ weak}}_{\mathrm{trivariate}}(\rho)$ in Equation (39) corresponds to the information geometric complexities of the mildly weak connected three-dimensional lattice. Interestingly, the growth stops at a critical value $\rho_{\mathrm{peak}} = \frac{1}{2}$, at which $R^{\mathrm{mildly\ weak}}_{\mathrm{trivariate}}(\rho_{\mathrm{peak}}) = R^{\mathrm{strong}}_{\mathrm{bivariate}}(\rho_{\mathrm{peak}})$. From Equation (43), we conclude that entropic inferences on three Gaussian distributed micro-variables on a fully connected lattice are carried out in a more efficient manner when the micro-variables are negatively correlated. Instead, when such micro-variables are positively correlated, macroscopic predictions become more complex than in the absence of correlations. Furthermore, the ratio $R^{\mathrm{strong}}_{\mathrm{trivariate}}(\rho)$ of the information geometric complexities for this fully connected three-dimensional lattice increases in a monotonic fashion. These conclusions are similar to those presented for the bivariate case. However, there is a key feature of the IGC to emphasize when passing from the two-dimensional to the three-dimensional manifolds associated with fully connected lattices: the effects of negative correlations and positive correlations are amplified with respect to the respective absence-of-correlations scenarios,

\[
\frac{R^{\mathrm{strong}}_{\mathrm{trivariate}}(\rho)}{R^{\mathrm{strong}}_{\mathrm{bivariate}}(\rho)} = \sqrt{\frac{1+2\rho}{1+\rho}},
\tag{44}
\]

where $\rho \in \left( -\frac{1}{2}, 1 \right)$.

Specifically, carrying out entropic inferences on the higher-dimensional manifold in the presence of anti-correlations, that is for $\rho \in \left( -\frac{1}{2}, 0 \right)$, is less complex than on the lower-dimensional manifold, as is evident from Equation (44). The converse is true in the presence of positive correlations, that is for ρ ∈ (0, 1).
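Equation (44) can be probed numerically; the following one-liner (our own addition) confirms that the amplification factor is below 1 for anticorrelations and above 1 for positive correlations:

```python
import math

# Equation (44): ratio of the trivariate to the bivariate fully connected
# complexity ratios, defined for rho in (-1/2, 1).
amplification = lambda rho: math.sqrt((1.0 + 2.0 * rho) / (1.0 + rho))
```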

4. Concluding Remarks

In summary, we have considered low-dimensional Gaussian statistical models (up to a trivariate model) and investigated their dynamical (temporal) complexity. This has been quantified by the volume spanned by geodesics in the space of the parameters characterizing the probability distribution functions. To the best of our knowledge, there is no other dynamic measure of complexity of geodesic paths on curved statistical manifolds that could be compared to our IGC. However, it could be worthwhile to understand the connection, if any, between our IGC and the complexity of paths of dynamical systems introduced in [20]. Specifically, according to the Alekseev–Brudno theorem in the algorithmic theory of dynamical systems [21], predicting each new segment of a chaotic trajectory requires an amount of information proportional to the length of that segment and independent of the full previous length of the trajectory. This means that this information cannot be extracted from observation of the previous motion, even an infinitely long one! If the instability is a power law, then the required information per unit time is inversely proportional to the full previous length of the trajectory and, asymptotically, prediction becomes possible.

For the sake of completeness, we also point out that the relevance of volumes in quantifying the static model complexity of statistical models was already noted in [22] and [23]: complexity is related to the volume of a model in the space of distributions, regarded as a Riemannian manifold with the natural metric defined by the Fisher–Rao metric tensor. Finally, we would like to point out that two of the Authors have recently associated Gaussian statistical models to networks [17]. Specifically, it is assumed that random variables are located on the vertices of the network, while correlations between random variables are regarded as weighted edges of the network. Within this framework, a static network complexity measure has been proposed as the volume of the corresponding statistical manifold. We emphasize that such a static measure could, in principle, be applied to time-dependent networks by accommodating time-varying weights on the edges [24]. This requires the consideration of a time sequence of different statistical manifolds. Thus, we could follow the time evolution of a network's complexity through the time evolution of the volumes of the associated manifolds.

In this work we have uncovered that, in order to have a reduction in time of the complexity, one has to consider both mean and variance as macro-variables. This leads to different topological structures of the parameter space in (13); in particular, we have to consider at least a 2-dimensional manifold in order to have effects such as a power-law decay of the complexity. Hence, the minimal hypothesis in a multivariate Gaussian model consists in considering all mean values equal and all covariances equal. In such a case, however, the complexity shows interesting features depending on the correlation among micro-variables (as summarized in Figure 1). For a trivariate model with only two correlations, the information geometric complexity ratio exhibits a non-monotonic behavior in ρ (the correlation parameter), taking the value zero at the extrema of the range of ρ. In contrast, for closed configurations (bivariate and trivariate models with all micro-variables correlated with each other) the complexity ratio exhibits a monotonic behavior in terms of the correlation parameter. The fact that in such cases this ratio cannot be zero at the extrema of the range of ρ is reminiscent of the geometric frustration phenomenon that occurs in the presence of loops [11].

Specifically, recall that a geometrically frustrated system cannot simultaneously minimize all of its interactions because of geometric constraints [11,18]. For example, geometric frustration can occur in an Ising model, an array of spins (for instance, atoms that can take the states ±1) that are magnetically coupled to each other. In a ferromagnetic model, if one spin is, say, in the +1 state, then it is energetically favorable for its immediate neighbors to be in the same state. In antiferromagnetic systems, on the contrary, nearest-neighbor spins favor alignment in opposite directions. This rule is easily satisfied on a square but, due to geometric frustration, cannot be satisfied on a triangle: in an antiferromagnetic triangular Ising model, any three neighboring spins are frustrated. Geometric frustration in triangular Ising models can be observed by considering spin systems with coupling constant J = ±1 and analyzing the fluctuations of the energy as a function of temperature. In the antiferromagnetic case J = −1 there is no peak at all in the standard deviation of the energy, and a monotonic behavior is recorded; this indicates that the antiferromagnetic system does not undergo a phase transition to a state with long-range order. In the case J = +1, instead, a peak in the energy fluctuations emerges. This significant change in the behavior of the energy fluctuations as a function of temperature in triangular spin configurations is a signature of frustrated interactions in the system [19].
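The triangle-versus-square argument can be checked by brute-force enumeration (a standalone sketch using the convention E = −J Σ sᵢsⱼ, so that J = −1 is the antiferromagnetic case above):

```python
from itertools import product

def ground_state_energy(edges, n_spins, J):
    """Minimum of E = -J * sum_{(i,j) in edges} s_i * s_j over all
    2^n_spins spin configurations (J = +1 ferro-, J = -1 antiferromagnetic)."""
    return min(-J * sum(s[i] * s[j] for i, j in edges)
               for s in product((-1, 1), repeat=n_spins))

triangle = [(0, 1), (1, 2), (0, 2)]
square = [(0, 1), (1, 2), (2, 3), (0, 3)]

# Antiferromagnet (J = -1): the square satisfies all 4 bonds with
# alternating spins (E = -4), but on the triangle one of the 3 bonds is
# always unsatisfied, so the energy cannot drop below -1: frustration.
print(ground_state_energy(square, 4, -1))    # -4
print(ground_state_energy(triangle, 3, -1))  # -1
```

In the unfrustrated ferromagnetic case (J = +1), by contrast, both geometries reach the minimum −(number of bonds), since every bond can be satisfied simultaneously.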

In this article, we observe a significant change in the behavior of the information geometric complexity ratios as a function of the correlation coefficient in trivariate Gaussian statistical models. Specifically, in the fully connected trivariate case, no peak arises and the information geometric complexity ratio is monotonic in ρ. In the mildly weak connected trivariate case, instead, a peak in the information geometric complexity ratio is recorded at ρpeak ≥ 0. This dramatic disparity in behavior can be ascribed to the fact that, when carrying out statistical inferences with positively correlated Gaussian random variables, the most favorable maximum entropy scenario is incompatible with these working hypotheses. Thus, the system appears frustrated.

These considerations lead us to conclude that we have uncovered an interesting information geometric analog of the standard geometric frustration effect in Ising spin models. However, a conclusive claim of the existence of such an analog requires a deeper understanding. A forthcoming research project along these lines will be a detailed investigation of both arbitrary triangular and square configurations of correlated Gaussian random variables, taking into account pairwise interactions of different intensities and signs (ρij ≠ ρik for j ≠ k, ∀i).

Acknowledgments

Domenico Felice and Stefano Mancini acknowledge the financial support of the Future and Emerging Technologies (FET) programme within the Seventh Framework Programme for Research of the European Commission, under the FET-Open grant agreement TOPDRIM, number FP7-ICT-318121.

Author Contributions

The authors have equally contributed to the paper. All authors read and approved the final manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Feldman, D.P.; Crutchfield, J.P. Measures of Statistical Complexity: Why? Phys. Lett. A 1998, 238, 244–252.
  2. Kolmogorov, A.N. A new metric invariant of transitive dynamical systems and of automorphisms of Lebesgue spaces. Doklady Akademii Nauk SSSR 1958, 119, 861–864.
  3. Caticha, A. Entropic Dynamics. In Bayesian Inference and Maximum Entropy Methods in Science and Engineering: The 22nd International Workshop, Moscow, ID, USA, 3–7 August 2002; Fry, R.L., Ed.; American Institute of Physics: College Park, MD, USA, 2002; Volume 617, p. 302.
  4. Caticha, A.; Preuss, R. Maximum entropy and Bayesian data analysis: Entropic prior distributions. Phys. Rev. E 2004, 70, 046127.
  5. Amari, S.; Nagaoka, H. Methods of Information Geometry; Oxford University Press: New York, NY, USA, 2000.
  6. Cafaro, C. Works on an information geometrodynamical approach to chaos. Chaos Solitons Fractals 2009, 41, 886–891.
  7. Pettini, M. Geometry and Topology in Hamiltonian Dynamics and Statistical Mechanics; Springer-Verlag: Berlin/Heidelberg, Germany, 2007.
  8. Lebowitz, J.L. Microscopic Dynamics and Macroscopic Laws. Ann. N. Y. Acad. Sci. 1981, 373, 220–233.
  9. Shibata, T.; Chawanya, T.; Kaneko, K. Noiseless Collective Motion out of Noisy Chaos. Phys. Rev. Lett. 1999, 82, 4424–4427.
  10. Ali, S.A.; Cafaro, C.; Kim, D.-H.; Mancini, S. The effect of the microscopic correlations on the information geometric complexity of Gaussian statistical models. Physica A 2010, 389, 3117–3127.
  11. Sadoc, J.F.; Mosseri, R. Geometrical Frustration; Cambridge University Press: Cambridge, UK, 2006.
  12. Lee, J.M. Riemannian Manifolds: An Introduction to Curvature; Springer: New York, NY, USA, 1997.
  13. Do Carmo, M.P. Riemannian Geometry; Springer: New York, NY, USA, 1992.
  14. Cafaro, C.; Ali, S.A. Jacobi fields on statistical manifolds of negative curvature. Physica D 2007, 234, 70–80.
  15. Cafaro, C.; Giffin, A.; Ali, S.A.; Kim, D.-H. Reexamination of an information geometric construction of entropic indicators of complexity. Appl. Math. Comput. 2010, 217, 2944–2951.
  16. Cafaro, C.; Mancini, S. Quantifying the complexity of geodesic paths on curved statistical manifolds through information geometric entropies and Jacobi fields. Physica D 2011, 240, 607–618.
  17. Felice, D.; Mancini, S.; Pettini, M. Quantifying Networks Complexity from Information Geometry Viewpoint. J. Math. Phys. 2014, 55, 043505.
  18. Moessner, R.; Ramirez, A.P. Geometrical Frustration. Phys. Today 2006, 59, 24–29.
  19. MacKay, D.J.C. Information Theory, Inference, and Learning Algorithms; Cambridge University Press: Cambridge, UK, 2003.
  20. Brudno, A.A. The complexity of the trajectories of a dynamical system. Uspekhi Mat. Nauk 1978, 33, 207–208.
  21. Alekseev, V.M.; Yacobson, M.V. Symbolic dynamics and hyperbolic dynamic systems. Phys. Rep. 1981, 75, 290–325.
  22. Myung, J.; Balasubramanian, V.; Pitt, M.A. Counting probability distributions: Differential geometry and model selection. Proc. Natl. Acad. Sci. USA 2000, 97, 11170–11175.
  23. Rodriguez, C.C. The volume of bitnets. AIP Conf. Proc. 2004, 735, 555–564.
  24. Motter, A.E.; Albert, R. Networks in motion. Phys. Today 2012, 65, 43–48.
Figure 1. Ratio R(ρ) of volumes vs. degree of correlation ρ. The solid line refers to R_bivariate^strong(ρ), the dotted line to R_trivariate^weak(ρ), the dashed line to R_trivariate^mildly weak(ρ), and the dash-dotted line to R_trivariate^strong(ρ).