This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

Markov random field models are powerful tools for the study of complex systems. However, little is known about how the interactions between the elements of such systems are encoded, especially from an information-theoretic perspective. In this paper, our goal is to elucidate the connection between Fisher information, Shannon entropy, information geometry and the behavior of complex systems modeled by isotropic pairwise Gaussian Markov random fields. We propose analytical expressions to compute local and global versions of these measures using Besag's pseudo-likelihood function, characterizing the system's behavior through its Fisher curve, a parametric trajectory across the information space that provides a geometric representation for the study of complex systems in which temperature deviates from infinity. Computational experiments show how the proposed tools can be useful in extracting relevant information from complex patterns. The obtained results quantify and support our main conclusion: in terms of information, moving towards higher entropy states (A → B) is different from moving towards lower entropy states (B → A), since the Fisher curves are not the same, given a natural orientation (the direction of time).

With the increasing value of information in modern society and the massive volume of digital data that is available, there is an urgent need to develop novel methodologies for data filtering and analysis in complex systems. In this scenario, the notion of what is informative or not is a top priority. Sometimes, patterns that at first may appear to be locally irrelevant may turn out to be extremely informative from a more global perspective. In complex systems, this is a direct consequence of the intricate non-linear relationships between the pieces of data at different locations and scales.

Within this context, information-theoretic measures play a fundamental role in a huge variety of applications, since they represent statistical knowledge in a systematic, elegant and formal framework. Since the first works of Shannon [

In general, classical statistical inference is focused on capturing information about the location and dispersion of unknown parameters of a given family of distributions and on studying how this information is related to uncertainty in estimation procedures. In typical situations, an exponential family of distributions and an independence hypothesis (independent random variables) are often assumed, giving the likelihood function a series of desirable mathematical properties [

Although mathematically convenient for many problems, in complex systems modeling, the independence assumption is not reasonable, because much of the information is somehow encoded in the relations between the random variables [

In this paper, we assume an isotropic pairwise Gaussian Markov random field (GMRF) model [

In searching for answers for our fundamental question, investigations led us to an exact expression for the asymptotic variance of the maximum pseudo-likelihood (MPL) estimator of

In summary, our idea is to describe the behavior of a complex system in terms of information as its temperature deviates from infinity (when the particles are statistically independent) to a lower bound. The obtained results suggest that, in the beginning, when the temperature is infinite and the information equilibrium prevails, the information is somehow spread along the system. However, when the temperature is low and this equilibrium condition no longer holds, we have a sparser representation in terms of information, since this information is concentrated in the boundaries of the regions that define a smooth global configuration. In the vast remainder of this “universe”, due to this smoothness constraint, the strong alignment between the particles prevails, which is exactly the expected global behavior for temperatures below a critical value (making the majority of the interaction patterns along the system uninformative).

The remainder of the paper is organized as follows: Section 2 discusses a technique for the estimation of the inverse temperature parameter, called the maximum pseudo-likelihood (MPL) approach, and provides derivations for the observed Fisher information in an isotropic pairwise GMRF model. Intuitive interpretations for the two versions of this local measure are discussed. In Section 3, we derive analytical expressions for the computation of the expected Fisher information, which allows us to assign a global information measure to a given system configuration. Similarly, in Section 4, an expression for the global entropy of a system modeled by a GMRF is shown. The results suggest a connection between maximum pseudo-likelihood and minimum entropy criteria in the estimation of the inverse temperature parameter on GMRFs. Section 5 discusses the uncertainty in the estimation of this important parameter by defining an expression for the asymptotic variance of its maximum pseudo-likelihood estimator in terms of both forms of Fisher information. In Section 6, the definition of the Fisher curve of a system as a parametric trajectory in the information space is proposed. Section 7 shows the experimental setup. Computational simulations with both Markov chain Monte Carlo algorithms and some real data were conducted, showing how the proposed tools can be used to extract relevant information from complex systems. Finally, Section 8 presents our conclusions, final remarks and directions for future work.

The remarkable Hammersley–Clifford theorem [

Let X = {x_1, x_2, . . . , x_n} denote the set of random variables defined on a lattice S, and let η_i denote the neighborhood of the i-th site. In the isotropic pairwise GMRF model, the local conditional density of x_i given its neighborhood is Gaussian, with a mean shifted by the neighboring values: p(x_i | η_i, θ) = (1/√(2πσ²)) exp{−[x_i − μ − β Σ_{j∈η_i} (x_j − μ)]² / (2σ²)}, where θ = (μ, σ², β); μ and σ² are the expected value and the variance of the random variables, and β is the inverse temperature parameter, which controls the spatial dependence between x_i and its neighborhood η_i.

Maximum likelihood estimation of MRF parameters is intractable, due to the existence of the partition function in the joint Gibbs distribution. An alternative, proposed by Besag [

Given X = {x_1, x_2, . . . , x_n}, the pseudo-likelihood function is defined as the product of the local conditional densities of each variable x_i given its neighborhood: L(θ; X) = ∏_{i=1}^{n} p(x_i | η_i, θ).

Note that the pseudo-likelihood function is a function of the parameters. For better mathematical tractability, it is usual to take the logarithm of L(θ; X^{(t)}). Plugging the GMRF local conditional densities into this definition yields the log-pseudo-likelihood function of the model.
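As an illustration, the log-pseudo-likelihood of this model can be evaluated numerically. The sketch below assumes a 2D lattice with a first-order (4-connected) neighborhood and toroidal borders; the helper names (`neighbor_sum`, `log_pseudo_likelihood`) are ours, not the paper's notation:

```python
import numpy as np

def neighbor_sum(X, mu):
    # Sum of the centered 4-connected neighbors at each site,
    # with toroidal boundary conditions for simplicity.
    C = X - mu
    return (np.roll(C, 1, 0) + np.roll(C, -1, 0) +
            np.roll(C, 1, 1) + np.roll(C, -1, 1))

def log_pseudo_likelihood(X, mu, sigma2, beta):
    # Sum over sites of the log of the Gaussian local conditional density:
    # log p(x_i | eta_i) = -1/2 log(2 pi sigma^2)
    #                      - (x_i - mu - beta * s_i)^2 / (2 sigma^2)
    r = (X - mu) - beta * neighbor_sum(X, mu)
    n = X.size
    return (-0.5 * n * np.log(2 * np.pi * sigma2)
            - (r ** 2).sum() / (2 * sigma2))
```

For a spatially independent field (β = 0), the function is maximized near β = 0, as expected.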

By differentiating the log-pseudo-likelihood function with respect to each parameter and solving the resulting pseudo-likelihood equations, we obtain the maximum pseudo-likelihood estimators; in the absence of spatial dependence, the estimators of μ and σ² become the widely known sample mean and sample variance.

Since the cardinality of the neighborhood system, η_i, does not depend on i, the pseudo-likelihood equation for β can be solved exactly, leading to a closed-form estimator: β̂_MPL = [Σ_i (x_i − μ̂) Σ_{j∈η_i} (x_j − μ̂)] / [Σ_i (Σ_{j∈η_i} (x_j − μ̂))²], which can also be expressed in terms of the sample covariances σ̂_ij between x_i and its neighbors x_j and the sample covariances σ̂_jk between pairs of neighbors of x_i.
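A minimal numerical sketch of this closed-form estimator, under the same simplifying assumptions (4-connected neighborhood, toroidal borders, μ replaced by the sample mean); `beta_mpl` is a hypothetical helper name:

```python
import numpy as np

def beta_mpl(X):
    # Closed-form maximum pseudo-likelihood estimate of beta:
    # the zero of the pseudo-likelihood score with respect to beta.
    mu = X.mean()
    C = X - mu
    s = (np.roll(C, 1, 0) + np.roll(C, -1, 0) +
         np.roll(C, 1, 1) + np.roll(C, -1, 1))
    return (C * s).sum() / (s * s).sum()
```

For an i.i.d. (infinite-temperature) field, the estimate should be close to zero.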

Basically, Fisher information measures the amount of information a sample conveys about an unknown parameter. It can be thought of as the likelihood analog of entropy, which is a probability-based measure of uncertainty. Often, when we are dealing with independent and identically distributed (i.i.d.) random variables, the computation of the global Fisher information presented in a random sample is straightforward: since the joint density factors as a product, the information in the sample is simply n times the information conveyed by a single observation x_i.

It is widely known from statistical inference theory that, under certain regularity conditions, information equality holds in the case of independent observations in the exponential family [ ]: the expected squared score equals the negative expected second derivative of the log-likelihood, where L(θ; X^{(t)}) denotes the likelihood function for a global configuration at a time instant t.

However, given the intrinsic spatial dependence structure of Gaussian Markov random field models, information equilibrium is not a natural condition. As we will discuss later, in general, information equality fails. Thus, in a GMRF model, we have to consider two kinds of Fisher information, from now on denoted by Type I (due to the first derivative of the pseudo-likelihood function) and Type II (due to the second derivative of the pseudo-likelihood function). Eventually, when certain conditions are satisfied, these two values of information will converge to a unique bound. Essentially,

In terms of information geometry, it has been shown that the geometric structure of the exponential family of distributions is essentially given by the Fisher information matrix, which is the natural Riemannian metric (metric tensor) [ ]. When β = 0, the model reduces to a Gaussian surface parametrized by (μ, σ²). However, as the inverse temperature parameter starts to increase, the original surface is gradually transformed into a 3D Riemannian manifold, equipped with a novel metric tensor (the 3 × 3 Fisher information matrix for μ, σ² and β), as β varies from β_MIN to β_MAX.

In order to quantify the amount of information conveyed by a local configuration pattern in a complex system, the concept of observed Fisher information must be defined.

Let X = {x_1, x_2, . . . , x_n} be the variables of an isotropic pairwise GMRF. The Type I local observed Fisher information of a pattern, formed by x_i and its neighborhood η_i, is defined as the square of the derivative (score) of the local log-likelihood function with respect to β, evaluated at the observed values.

Hence, for an isotropic pairwise GMRF model, the Type I local observed Fisher information regarding the pattern centered at x_i is: φ_β(x_i) = (1/σ⁴) { [x_i − μ − β Σ_{j∈η_i} (x_j − μ)] Σ_{j∈η_i} (x_j − μ) }².

Let X = {x_1, x_2, . . . , x_n} be the variables of an isotropic pairwise GMRF. The Type II local observed Fisher information of a pattern, formed by x_i and its neighborhood η_i, is defined as the negative of the second derivative of the local log-likelihood function with respect to β, evaluated at the observed values.

In the case of an isotropic pairwise GMRF model, the Type II local observed Fisher information regarding the pattern centered at x_i is: ψ_β(x_i) = (1/σ²) [Σ_{j∈η_i} (x_j − μ)]². Note that ψ_β(x_i) does not depend on the central value x_i, only on its neighborhood η_i.

Therefore, we have two local measures, φ_β(x_i) and ψ_β(x_i), that quantify the amount of information conveyed by each local configuration pattern of the system.
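The two local measures can be computed site by site to form information maps. The sketch below again assumes a 4-connected toroidal lattice; `local_fisher_maps` is a hypothetical helper name, with Type I taken as the squared score and Type II as the negated second derivative of the local log-density:

```python
import numpy as np

def local_fisher_maps(X, mu, sigma2, beta):
    # Local observed Fisher information for the spatial parameter beta.
    # Type I  (phi): squared first derivative (score) of the local
    #                log-density with respect to beta.
    # Type II (psi): negated second derivative; for the Gaussian
    #                conditional it depends only on the neighborhood.
    C = X - mu
    s = (np.roll(C, 1, 0) + np.roll(C, -1, 0) +
         np.roll(C, 1, 1) + np.roll(C, -1, 1))
    score = (C - beta * s) * s / sigma2
    phi = score ** 2              # Type I map
    psi = (s ** 2) / sigma2       # Type II map
    return phi, psi
```

Both maps are non-negative by construction and have the same shape as the input field.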

At this point, a relevant issue is the interpretation of these Fisher information measures in a complex system modeled by an isotropic pairwise GMRF. Roughly speaking, φ_β(x_i) measures the degree of agreement between a local pattern and the global behavior induced by the inverse temperature β: patterns that fit the current model well produce a score close to zero, and thus a small value of φ_β(x_i).

Now, let us move on to configuration patterns showing high values of φ_β(x_i). A high value of φ_β(x_i) indicates that the pattern deviates from the expected global behavior, and therefore conveys a large amount of information about x_i and its context.

As we will see later in the experiments section, typical informative patterns (those showing high values of φ_β(x_i)) are located along the boundaries between the homogeneous regions that compose the global configuration to which x_i belongs.

Let us now analyze the Type II local observed Fisher information, ψ_β(x_i). Since ψ_β(x_i) depends only on the neighborhood of x_i, it is large when the neighboring values deviate strongly from the mean in an aligned fashion, regardless of the value observed at the central site x_i.

On the other hand, if the global configuration is mostly composed of patterns exhibiting large values of ψ_β(x_i), the system shows a strongly induced spatial dependence structure. In such configurations, most patterns agree with the global behavior and thus have small φ_β(x_i); the few patterns that combine a large ψ_β(x_i) with a large φ_β(x_i) are precisely those lying on the boundaries between regions, where the local behavior contradicts the alignment of the neighborhood.

It is important to mention that these rather informal arguments define the basis for understanding the meaning of the asymptotic variance of maximum pseudo-likelihood estimators, as we will discuss ahead. In summary, φ_β(x_i) measures the local mismatch between a pattern and the global behavior, while ψ_β(x_i) measures the strength of the spatial dependence structure around x_i.

In order to avoid the use of approximations in the computation of the global Fisher information in an isotropic pairwise GMRF, in this section, we provide exact expressions for Φ_β and Ψ_β, the expected (global) counterparts of the local measures φ_β(x_i) and ψ_β(x_i).

Recall that the Type I expected Fisher information, from now on denoted by Φ_β, is defined as the expected value of the squared derivative of the log-pseudo-likelihood function with respect to β.

The Type II expected Fisher information, from now on denoted by Ψ_β, is defined as the expected value of the negative of the second derivative of the log-pseudo-likelihood function with respect to β.

We first proceed to the derivation of Φ_β.

Hence, the expression for Φ_β is obtained by expanding the squared score function and taking the expectation of each resulting term.

Then, the first term of the expansion can be expressed in terms of the covariances σ_sr = E[(x_s − μ)(x_r − μ)] between pairs of neighboring variables x_s and x_r.

The third term of the expansion is handled in the same way. Since Φ_β, like the other global measures, is computed for a single global configuration X^{(t)} (a photograph of the system at a given moment), we consider the sample covariances between x_s and x_r as estimates of the corresponding expectations.

Before proceeding, we would like to clarify some points regarding the estimation of the inverse temperature parameter. Two scenarios are possible: (1) the parameter is spatially-invariant, which means that a single value of β characterizes a global configuration of the system, X^{(t)} (this is our assumption); or (2) the parameter is spatially-variant, which means that we have a set of local parameters β_s, one for each site of the lattice.

Therefore, in our case (spatially-invariant β), the expected Fisher information Φ_β can be computed from the covariances σ_ij between x_i and each of its neighbors x_j, together with the covariances σ_jk between pairs of neighbors x_j and x_k of x_i.

Following the same methodology of replacing the likelihood function by the pseudo-likelihood function of the GMRF model, a closed-form expression for Ψ_β is obtained in an analogous fashion.

In order to simplify the notation and also to make the computations easier, the expressions for Φ_β and Ψ_β can be rewritten in matrix-vector form. Let p_i denote the vector composed of x_i and its neighbors, let Σ_p denote the covariance matrix of the vectors p_i, and let ρ denote the vector of covariances between x_i and each of its neighbors x_j. Both Φ_β and Ψ_β can then be expressed compactly in terms of Σ_p and ρ.

Let X = {x_1, x_2, . . . , x_n} be the variables of an isotropic pairwise GMRF. The Type I expected Fisher information Φ_β of a global configuration X^{(t)} admits a closed-form matrix-vector expression in terms of the covariance structure of the local patterns formed by x_i and its neighborhood η_i, where ‖Σ‖_+ denotes the summation of all the entries of the matrix Σ.

Let X = {x_1, x_2, . . . , x_n} be the variables of an isotropic pairwise GMRF. The Type II expected Fisher information Ψ_β of a global configuration X^{(t)} admits an analogous closed-form matrix-vector expression.

From the definitions of both Φ_β and Ψ_β, it is clear that, in general, information equality fails: Φ_β = Ψ_β only at a particular value β^*. In other words, when the covariances σ_ij between x_i and its neighbors x_j vanish, we have β^* = 0, which, in this case, is also the maximum pseudo-likelihood estimate of β, since, in this matrix-vector notation, β̂_MPL is the value for which the summation ‖·‖_+ of the relevant covariance terms equals zero; as a consequence, Φ_β = Ψ_β at this point.

Our definition of entropy is done by repeating the same process employed to derive Φ_β and Ψ_β: the entropy is defined as the expected value of the negative log-pseudo-likelihood function.

Let X = {x_1, x_2, . . . , x_n} be the variables of an isotropic pairwise GMRF. The entropy H_β of a global configuration X^{(t)} is defined as the expected value of its negative log-pseudo-likelihood.

After some algebra, the expression for H_β can be written in a compact form that makes its dependence on β explicit.

Let X = {x_1, x_2, . . . , x_n} be the variables of an isotropic pairwise GMRF. The entropy H_β of a global configuration X^{(t)} can be expressed in terms of H_G, the entropy of a simple Gaussian random variable (which depends only on σ²), and of terms involving Ψ_β.

Note that the Shannon entropy is a quadratic function of the spatial dependence parameter, β. This observation suggests a connection between the maximum pseudo-likelihood and minimum entropy criteria in the estimation of the inverse temperature parameter: the value of β at which Φ_β = Ψ_β, which corresponds to the maximum pseudo-likelihood estimate, is also a critical point of the entropy H_β.
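The quadratic dependence on β can be checked numerically with a plug-in estimate of the entropy: the sample average of the negative local log-density. This is a simplification of the closed-form expression, not the paper's exact formula; the 4-connected toroidal neighborhood and the helper name `entropy_plugin` are our assumptions:

```python
import numpy as np

def entropy_plugin(X, mu, sigma2, beta):
    # Sample average of the negative local log-density: a plug-in
    # estimate of the pseudo-likelihood entropy. Expanding the square
    # shows it is a quadratic function of beta.
    C = X - mu
    s = (np.roll(C, 1, 0) + np.roll(C, -1, 0) +
         np.roll(C, 1, 1) + np.roll(C, -1, 1))
    r = C - beta * s
    return (0.5 * np.log(2 * np.pi * sigma2)
            + (r ** 2).mean() / (2 * sigma2))
```

For an i.i.d. unit-variance field at β = 0, the estimate approaches the differential entropy of a standard Gaussian, (1/2) log(2πe), and grows as β moves away from the minimizer.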

It is known from the statistical inference literature that unbiasedness is a property that is not granted by maximum likelihood estimation, nor by maximum pseudo-likelihood (MPL) estimation. Actually, there is no universal method that guarantees the existence of unbiased estimators for a fixed sample size. For this reason, we resort to asymptotic analysis to characterize the uncertainty of the MPL estimator of β in terms of Φ_β and Ψ_β.

In mathematical statistics, asymptotic evaluations uncover several fundamental properties of inference methods, providing a powerful and general tool for studying and characterizing the behavior of estimators. In this section, our objective is to derive an expression for the asymptotic variance of the maximum pseudo-likelihood estimator of the inverse temperature parameter, β̂_MPL. The derivation is based on an expansion of the derivative of the log-pseudo-likelihood function log L(θ; X^{(t)}) with relation to β around the true parameter value, leading to an expression that involves both Φ_β and Ψ_β.

Let X = {x_1, x_2, . . . , x_n} be the variables of an isotropic pairwise GMRF. The asymptotic variance of the maximum pseudo-likelihood estimator of β is given by the ratio of the two forms of expected Fisher information: υ_β = Φ_β / Ψ_β².

Note that, when information equilibrium prevails, that is, Φ_β = Ψ_β, the asymptotic variance reduces to the usual Cramér–Rao form, 1/Φ_β.
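Assuming the sandwich form υ_β = Φ_β/Ψ_β² (a standard shape for composite-likelihood estimators, which reduces to 1/Φ_β under information equality), a plug-in estimate can be computed from the local quantities. The lattice, neighborhood and function name are our assumptions:

```python
import numpy as np

def asymptotic_variance(X, mu, sigma2, beta):
    # Plug-in sandwich estimate Phi / Psi^2, where Phi and Psi are the
    # sums over sites of the Type I and Type II local observed Fisher
    # information for beta (4-connected toroidal neighborhood).
    C = X - mu
    s = (np.roll(C, 1, 0) + np.roll(C, -1, 0) +
         np.roll(C, 1, 1) + np.roll(C, -1, 1))
    score = (C - beta * s) * s / sigma2
    phi = (score ** 2).sum()        # Type I, summed over sites
    psi = (s ** 2 / sigma2).sum()   # Type II, summed over sites
    return phi / psi ** 2
```

The estimate shrinks as the lattice grows, reflecting the usual O(1/n) behavior of the variance.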

With the definitions of Φ_β, Ψ_β and H_β in hand, we can now characterize a complex system through the joint evolution of these measures.

Let X^{(β_1)}, X^{(β_2)}, . . . , X^{(β_n)} be a sequence of global configurations of an isotropic pairwise GMRF generated as the inverse temperature parameter varies from β_MIN = β_1 to β_MAX = β_n. The Fisher curve of the system is the parametric trajectory in R³ obtained by mapping each configuration X^{(β_i)} to the point (Φ_{β_i}, Ψ_{β_i}, H_{β_i}); that is, it is the function F: R → R³ from the parameter space to the information space, F^{(β)}, defined by: F^{(β)} = (Φ_β, Ψ_β, H_β).
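A self-contained sketch of how such a curve can be traced numerically: configurations are sampled with a checkerboard Gibbs sampler (each site redrawn from its exact Gaussian conditional, a variant of the MCMC simulation used in the paper), and each value of β is mapped to a point whose coordinates are sample-average estimates of the Type I information, the Type II information and the entropy. The lattice, neighborhood and function names are our assumptions; β is kept below 1/4 so that the joint model remains well defined for the 4-neighborhood:

```python
import numpy as np

def gibbs_sweep(X, mu, sigma2, beta, rng):
    # One checkerboard sweep: each site is redrawn from its exact
    # Gaussian conditional N(mu + beta * s_i, sigma2).
    rows = np.arange(X.shape[0])[:, None]
    cols = np.arange(X.shape[1])[None, :]
    for parity in (0, 1):
        C = X - mu
        s = (np.roll(C, 1, 0) + np.roll(C, -1, 0) +
             np.roll(C, 1, 1) + np.roll(C, -1, 1))
        mask = ((rows + cols) % 2) == parity
        X[mask] = (mu + beta * s +
                   np.sqrt(sigma2) * rng.standard_normal(X.shape))[mask]
    return X

def measures(X, mu, sigma2, beta):
    # Sample-average estimates of the Type I (Phi) and Type II (Psi)
    # information and the plug-in entropy (H) for one configuration.
    C = X - mu
    s = (np.roll(C, 1, 0) + np.roll(C, -1, 0) +
         np.roll(C, 1, 1) + np.roll(C, -1, 1))
    score = (C - beta * s) * s / sigma2
    phi = (score ** 2).mean()
    psi = (s ** 2 / sigma2).mean()
    H = (0.5 * np.log(2 * np.pi * sigma2)
         + ((C - beta * s) ** 2).mean() / (2 * sigma2))
    return phi, psi, H

def fisher_curve(betas, shape=(32, 32), mu=0.0, sigma2=5.0,
                 seed=0, sweeps=20):
    # Map each beta to the point (Phi, Psi, H): the Fisher curve.
    rng = np.random.default_rng(seed)
    X = mu + np.sqrt(sigma2) * rng.standard_normal(shape)
    curve = []
    for b in betas:
        for _ in range(sweeps):
            gibbs_sweep(X, mu, sigma2, b, rng)
        curve.append(measures(X, mu, sigma2, b))
    return np.array(curve)
```

Plotting the rows of the returned array against each other (Φ vs. Ψ vs. H) gives the 2D and 3D curves discussed in the experiments.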

In the next sections, we show some computational experiments that illustrate the effectiveness of the proposed tools in measuring the information encoded in complex systems. We want to investigate what happens to the Fisher curve as the inverse temperature parameter is modified in order to control the system's global behavior. Our main conclusion, which is supported by experimental analysis, is that moving towards higher entropy states is, in terms of information, different from moving towards lower entropy states, since the corresponding Fisher curves are not the same.

This section discusses some numerical experiments proposed to illustrate some applications of the derived tools in both simulations and real data. Our computational investigations were divided into two main sets of experiments:

Local analysis: analysis of the local versions of the measures (φ_β(x_i) and ψ_β(x_i)) through Fisher information maps computed for individual configurations;

Global analysis: analysis of the global versions of the measures (Φ_β, Ψ_β and H_β) along MCMC simulations in which the inverse temperature parameter is varied.

First, in order to illustrate a simple application of both forms of local observed Fisher information, φ_β(x_i) and ψ_β(x_i), we generated GMRF outcomes using σ² = 5 and different values of the inverse temperature parameter.

Three Fisher information maps were generated from both the initial and the resulting configurations. The first map was obtained by calculating the value of φ_β(x_i) for every local pattern, and the second by calculating ψ_β(x_i); a third map combines the two local measures. In the low-temperature configuration, the largest values concentrate along the boundaries between homogeneous regions, whereas in the infinite-temperature configuration, the information is spread over the entire field.

In order to study the behavior of a complex system that evolves from an initial State A to another State B, we use the Metropolis–Hastings algorithm, an MCMC simulation method, to generate a sequence of valid isotropic pairwise GMRF model outcomes for different values of the inverse temperature parameter, recording Φ_β, Ψ_β and H_β at each iteration.

To simulate a system in which we can control the inverse temperature parameter, we define an updating rule for β: starting at β_MIN, the parameter is increased by a fixed amount at each iteration until it reaches β_MAX, and then it is decreased back to β_MIN at the same rate, while Φ_β, Ψ_β and H_β are computed along the way. In these simulations, we used σ² = 5, with the neighborhood system η_i fixed throughout.
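The triangular updating rule described above can be written as a simple schedule generator (a hypothetical helper, not the paper's code):

```python
import numpy as np

def beta_schedule(beta_min, beta_max, steps_up, steps_down):
    # Triangular updating rule: beta increases linearly from beta_min
    # to beta_max, then decreases linearly back to beta_min.
    up = np.linspace(beta_min, beta_max, steps_up)
    down = np.linspace(beta_max, beta_min, steps_down)
    return np.concatenate([up, down[1:]])  # drop the duplicated peak
```

The resulting array can drive any of the simulation sketches above, one GMRF outcome per schedule entry.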

A plot of both forms of the expected Fisher information, Φ_β and Ψ_β, along the simulation shows that the two measures coincide near the information equilibrium condition and progressively diverge as β moves away from it.

The results also show that the difference between Φ_β and Ψ_β is directly related to the uncertainty in the estimation of the inverse temperature parameter.


We now proceed to the analysis of the Shannon entropy of the system along the simulation. Despite showing a behavior similar to that of Ψ_β, the entropy H_β does not follow the same trajectory in both directions: for identical values of β, the values of H_β observed while the system is heated differ from those observed while it is cooled, in agreement with the asymmetry revealed by the comparison of Φ_β and Ψ_β.

Another interesting global information-theoretic measure is the L-information, from now on denoted by L_β, which quantifies the fraction of informative local patterns in a configuration (the points where the derivatives of log L(θ; X^{(t)}) are not null). Along the simulation, the behavior of L_β is consistent with that of Φ_β and Ψ_β.

To investigate the intrinsic non-linear connection between Φ_β, Ψ_β and H_β, we plotted the Fisher curve of the system along the simulation, obtained by varying the inverse temperature parameter from β_min to β_max and then back to β_min.

We can see that the majority of the points along the Fisher curve is concentrated around two regions: (A) around the information equilibrium condition (an absence of short-term and long-term correlations, since β ≈ 0); and (B) around the configurations obtained for values of β above the critical value β_c, where the spatial dependence structure dominates.

By now, some observations can be highlighted. First, the natural orientation of the Fisher curve defines the direction of time. The natural A–B path (increase in entropy) is given by the blue curve, and the natural B–A path (decrease in entropy) is given by the red curve. In other words, the only possible way to walk from A to B (increasing the entropy) is through the blue path, and the only possible way to walk from B to A (decreasing the entropy) is through the red path. Since the two paths are different, the joint evolution of Φ_β, Ψ_β and H_β makes the two directions of motion distinguishable.

In summary, the central idea discussed here is that, while entropy provides a measure of the order/disorder of the system at a given configuration, X^{(t)}, Fisher information links these thermodynamical states through a path (the Fisher curve). Thus, Fisher information is a powerful mathematical tool in the study of complex and dynamical systems, since it establishes how these different thermodynamical states are related along the evolution of the inverse temperature. Instead of knowing only whether the entropy H_β is increasing or decreasing, we can characterize how the system moves between states of different entropy.

To test whether a system can recover part of its original configuration after a perturbation is induced, we conducted another computational experiment. During a stable simulation, two kinds of perturbations were induced in the system: (1) the value of the inverse temperature parameter was set to zero for the next two consecutive iterations; (2) the value of the inverse temperature parameter was set to the equilibrium value, β^*.

When the system is disturbed by setting β to zero, both Φ_β and Ψ_β drop abruptly towards the information equilibrium condition; when the perturbation ceases, the system recovers a significant part of its previous configuration. A similar transient behavior is observed when β is set to the equilibrium value, β^*.

The goal of this section is to summarize the main results obtained in this paper, focusing on the interpretation of the Fisher curve of a system modeled by a GMRF. First, our system is initialized with a random configuration, simulating that, at the moment of its creation, the temperature is infinite (β = 0). In this regime, Φ_β = Ψ_β and the information equilibrium prevails.

By reducing the global temperature (increasing β), the information equilibrium condition is broken: Φ_β and Ψ_β begin to diverge, and an induced spatial dependence structure emerges in the system.

Hence, after the break in the information equilibrium condition, there is a significant increase in the entropy as the system continues to evolve. This stage lasts while the temperature of the system is further reduced or kept stable. When the temperature starts to increase (β decreases), the system moves back towards the equilibrium condition, but through a different path in the information space.

This fundamental symmetry break becomes evident when we look at the Fisher curve of the system. We clearly see that the path from the state of minimum entropy, A, to the state of maximum entropy, B, defined by the curve, is different from the path that leads the system back from B to A.

Therefore, if that first fundamental symmetry break did not exist, or even if it had happened but the subsequent evolution of Φ_β and Ψ_β were identical in both directions, the Fisher curve would trace the same path back and forth, and it would be impossible to distinguish the two directions of time.

The definition of what is information in a complex system is a fundamental concept in the study of many problems. In this paper, we discussed the roles of two important statistical measures in isotropic pairwise Markov random fields composed of Gaussian variables: Shannon entropy and Fisher information. By using the pseudo-likelihood function of the GMRF model, we derived analytical expressions for these measures. The definition of the Fisher curve as a geometric representation for the study and analysis of complex systems allowed us to reveal the intrinsic non-linear relation between these information-theoretic measures and to gain insights about the behavior of such systems. Computational experiments demonstrate the effectiveness of the proposed tools in decoding information from the underlying spatial dependence structure of a Gaussian Markov random field. Typical informative patterns in complex systems are located in the boundaries of the clusters. One of the main conclusions of this scientific investigation concerns the notion of time in a complex system. The obtained results suggest that the relationship between Fisher information and entropy determines whether the system is moving forward or backward in time: in the natural orientation (when the system is evolving forward in time), increases and decreases in entropy are accompanied by characteristic joint variations of Φ_β and Ψ_β, and these rules are violated along the reverse path.

The author would like to thank CNPq (Brazilian Council for Research and Development) for the financial support through research grant number 475054/2011-3.

The authors declare no conflict of interest.

Example of Gaussian Markov random field (GMRF) model outputs for different values of the inverse temperature parameter, β.

Fisher information maps. The first row shows the information maps of the system when the temperature is infinite (β = 0), and the second row shows the maps for a low-temperature configuration; the maps display the local measures φ_β(x_i) and ψ_β(x_i) at each site.

Distribution of local L-information. When the temperature is infinite, the information is spread along the system. For low-temperature configurations, the number of local patterns with zero information content significantly increases, that is, the system is more sparse in terms of Fisher information.

Global configurations along a Markov chain Monte Carlo (MCMC) simulation. Evolution of the system as the inverse temperature parameter, β, is varied.

Evolution of Fisher information along an MCMC simulation. As the difference between Φ_β and Ψ_β grows (β → β^*), the uncertainty about the real inverse temperature parameter is minimized and the number of informative patterns increases. In the information equilibrium condition (β = β^{**}), it is hard to find informative patterns, since there is no induced spatial dependence structure.

Real and estimated inverse temperatures along the MCMC simulation. The system's global behavior is controlled by the real inverse temperature parameter values (blue line), used to generate the GMRF outputs. The maximum pseudo-likelihood estimate is used to compute both Φ_β and Ψ_β.

Evolution of the Shannon entropy, H_β, along an MCMC simulation.

Evolution of L-information along an MCMC simulation. When the system departs from the information equilibrium condition, the L-information, L_β, increases.

2D and 3D Fisher curves of a complex system along an MCMC simulation. The graph shows a parametric curve obtained by varying the inverse temperature parameter from β_MIN to β_MAX and back, with coordinates given by Φ_β, Ψ_β and the entropy.

2D and 3D Fisher curves along another MCMC simulation. The graph shows a parametric curve obtained by varying the inverse temperature parameter from β_MIN to β_MAX and back.

Relations between entropy and Fisher information. When a system modeled by an isotropic pairwise GMRF evolves in the natural orientation (forward in time), two rules that relate Fisher information and entropy can be observed, linking increases and decreases in the entropy H_β to characteristic joint variations of Φ_β and Ψ_β.

Disturbing the system to induce changes. Variation of Φ_β, Ψ_β and H_β when the system is disturbed by setting the inverse temperature parameter to zero and to the equilibrium value, β^*.

The sequence of outputs along the MCMC simulation before and after the system is disturbed. The first row shows the outputs around the perturbation; the recovery of the global structure (when β = β^*) indicates that the system was able to recover a significant part of its previous stable configuration.