Information Landscape and Flux, Mutual Information Rate Decomposition and Entropy Production

We explore the dynamics of information systems. We show that the driving force for information dynamics is determined by both the information landscape and the information flux, which determine the equilibrium time-reversible and the nonequilibrium time-irreversible behaviours of the system, respectively. We further demonstrate that the mutual information rate between the two subsystems can be decomposed into time-reversible and time-irreversible parts, analogous to the information landscape-flux decomposition of the dynamics. Finally, we uncover the intimate relation between the nonequilibrium thermodynamics, in terms of the entropy production rates, and the time-irreversible part of the mutual information rate. We demonstrate these features with the dynamics of a bivariate Markov chain.


Introduction
There is growing interest in studying information systems in the fields of control theory, information theory, communication theory, and biophysics [1,2,3,4,5,6]. Significant progress has been made recently towards the understanding of information systems in terms of information thermodynamics [10,11,12,13]. However, identifying the global driving force for information system dynamics is still challenging. Here we aim to fill this gap by quantifying the driving forces for information system dynamics. Inspired by the recent development of the landscape and flux theory for nonequilibrium systems [14,15,16], we show that the driving force for information dynamics is determined by both the information landscape and the information flux. The information flux is a measure of the degree of nonequilibriumness or time-irreversibility. Mutual information represents the correlation between two information subsystems. We uncover that the mutual information rate between the two subsystems can be decomposed into time-reversible and time-irreversible parts. This originates from the information landscape-flux decomposition of the dynamics. An important signature of nonequilibriumness is the entropy production or energy cost. We also uncover the intimate relation between the entropy production rates and the time-irreversible part of the mutual information rate. We demonstrate these features with the dynamics of a bivariate Markov chain.

Bivariate Markov Chains
Markov chains have often been assumed for the underlying information dynamics of a total system in a random environment. That is, the two subsystems together form a Markov chain in continuous or discrete time, the so-called Bivariate Markov Chain (BMC). The processes of the two subsystems are correspondingly said to be marginal processes or marginal chains. The BMC was used to model ion channel currents [2], and also to model delays and congestion in a computer network [3]. Recently, different BMC models have appeared in nonequilibrium statistical physics for capturing or implementing Maxwell's demon [4,5,6], where one marginal chain can be seen as exerting feedback control on the other. Although the BMC has been studied for decades, quantifying the dynamics of the whole system as well as of the two subsystems remains challenging. This is because neither marginal chain needs to be Markovian in general [7], and the quantification of the probabilities (densities) of the trajectories of the two subsystems involves complex manipulations of random matrices [8]. As a result, the problem is not exactly solvable analytically, and the corresponding numerical solutions often lack direct mathematical and physical interpretations.
The conventional analysis of the BMC focuses on the mutual information [9] between the two subsystems for quantifying the underlying information correlations. There are three main representations of this. The first was proposed by Sagawa [10,11] to explain the mechanism of Maxwell's demon in Szilard's engine.
In this representation, the mutual information between the demon and the controlled system characterizes the observation and the feedback of the demon. This leads to an elegant formulation which incorporates the increment of the mutual information into a unified fluctuation relation. The second representation was proposed by Esposito [12] in an attempt to explain the apparent violation of the second law in a specific BMC, the bipartite model, where the mutual information is divided into two parts corresponding to the two subsystems, said to be the information flows. This representation addresses the mechanism of the demon because the information flows contribute to the entropy production of both the demon and the controlled system. The first two representations are based on the ensembles of the subsystem states. This means that the mutual information is defined only on the time-sliced distributions of the system states, which lacks information about the subsystem dynamics: the time correlations of the observation and feedback of the demon. The last representation appears in the work of Seifert [13], where a more general definition of mutual information from information theory is used, defined on the trajectories of the two subsystems. More exactly, this is the so-called Mutual Information Rate (MIR), which quantifies the correlation between the dynamics of the two subsystems. However, due to the difficulties arising from the possibly non-Markovian marginal chains, exactly solvable models and comprehensive conclusions remain challenging in this representation.
In this study, we study the discrete-time BMC and its stochastic information dynamics. To avoid the technical difficulties caused by non-Markovian dynamics, we first assume that the two marginal chains follow Markovian dynamics. We explore the time-irreversibility of the BMC and the marginal processes in steady state.
Then we decompose the driving force for the underlying information dynamics into the information landscape and the information flux [14,15,16], representing the time-reversible and time-irreversible parts, respectively.
We also prove that the non-vanishing flux fully describes the time-irreversibility of BMC and marginal processes.
We focus on the mutual information rate between the two marginal chains in information dynamics.
Since the two marginal chains are assumed to be Markov chains here, the mutual information rate is exactly solvable analytically and can be seen as the averaged conditional correlation between the two subsystem states. Here, the conditional correlations reveal the time correlations between past and future states.
Corresponding to the landscape-flux decomposition in stochastic dynamics, we decompose the MIR into two parts: the time-reversible part and the time-irreversible part. The time-reversible part measures the correlations between the two marginal chains shared by the forward and backward processes of the BMC. The time-irreversible part measures the difference between the correlations in the forward and backward processes of the BMC. We can see that a non-vanishing time-irreversible part of the MIR must be driven by a non-vanishing flux in steady state, and is a sufficient condition for a BMC to be time-irreversible.
We also reveal the important fact that the time-irreversible part of the MIR contributes to the nonequilibrium Entropy Production Rate (EPR) of the BMC through the simple equality: EPR of BMC = EPR of 1st marginal chain + EPR of 2nd marginal chain + 2 × time-irreversible part of MIR.
This relation may help to develop a general theory of nonequilibrium interacting information system dynamics.

Information Landscape and Information Flux for Determining the Information Dynamics, Time-Irreversibility
Consider a finite-state, discrete-time, ergodic, and irreducible bivariate Markov chain Z = (X, S). We assume that the state space of X is given by X = {1, ..., d} and that the state space of S is a finite set S. The state space of Z is then given by Z = X × S. The time evolution of the distribution of Z is characterized by the following master equation in discrete time,

p_z(z; t+1) = Σ_{z'} q_z(z|z') p_z(z'; t), (2)

where p_z(z; t) = p_z(x, s; t) is the probability of observing state z (the joint probability of X = x and S = s) at time t, and the q_z(z|z') are the transition probabilities, with Σ_z q_z(z|z') = 1.
We assume that there exists a unique stationary distribution π_z such that π_z(z) = Σ_{z'} q_z(z|z') π_z(z').
Then given arbitrary initial distribution, the distribution goes to π z exponentially fast in time. If the initial distribution is π z , we say that Z is in Steady State (SS) and our discussion is based on this SS.
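As a numerical illustration of this convergence, the following Python sketch iterates the master equation until it reaches the fixed point π_z. The 4-state transition matrix is an arbitrary assumption for the demonstration, not taken from the text.

```python
import numpy as np

# Column-stochastic transition matrix Q[z, z'] = q_z(z | z') for an
# illustrative 4-state chain Z (the numbers are assumptions for the demo).
Q = np.array([
    [0.5, 0.2, 0.1, 0.3],
    [0.2, 0.5, 0.2, 0.1],
    [0.2, 0.1, 0.5, 0.2],
    [0.1, 0.2, 0.2, 0.4],
])

def stationary(Q, tol=1e-13, max_iter=100_000):
    """Iterate the master equation p <- Q p until it reaches a fixed point."""
    p = np.full(Q.shape[0], 1.0 / Q.shape[0])  # arbitrary initial distribution
    for _ in range(max_iter):
        p_next = Q @ p
        if np.abs(p_next - p).max() < tol:
            return p_next
        p = p_next
    raise RuntimeError("power iteration did not converge")

pi_z = stationary(Q)
# pi_z is the unique stationary distribution: Q pi_z = pi_z, entries sum to 1.
print(pi_z)
```

Power iteration suffices here because ergodicity and irreducibility guarantee a unique, exponentially attracting fixed point.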
The marginal chains of Z, i.e., X and S, do not need to be Markov chains in general. For simplicity of analysis, we assume that both marginal chains are Markov chains with transition probabilities q_x(x|x') and q_s(s|s') (for x, x' ∈ X and s, s' ∈ S), respectively. Then we have the following master equations for X and S,

p_x(x; t+1) = Σ_{x'} q_x(x|x') p_x(x'; t), (3)
p_s(s; t+1) = Σ_{s'} q_s(s|s') p_s(s'; t), (4)

where p_x(x; t) and p_s(s; t) are the probabilities of observing X = x and S = s at time t, respectively.
We consider that both Eqs. (3,4) have unique stationary solutions π_x and π_s, which satisfy π_x(x) = Σ_{x'} q_x(x|x') π_x(x') and π_s(s) = Σ_{s'} q_s(s|s') π_s(s'), respectively. Also, we assume that when Z is in SS, π_x and π_s are also achieved. The relations between π_x, π_s, and π_z read

π_x(x) = Σ_s π_z(x, s),  π_s(s) = Σ_x π_z(x, s). (5)

In the rest of this paper, we let X^T = {X(1), ..., X(T)} denote a time sequence of X of length T, and similarly for S^T, Z^T, and a generic chain C^T with C = X, S, or Z. In SS, the averaged number of forward transitions from state c' to state c per unit time reads

N(c' → c) = π_c(c') q_c(c|c').

This is also the probability of the time sequence C^T = {C(1) = c', C(2) = c}, (T = 2). Correspondingly, the averaged number of reverse transitions, denoted by N(c → c'), reads

N(c → c') = π_c(c) q_c(c'|c).

This is also the probability of the time-reverse sequence C̃^T = {C(1) = c, C(2) = c'}. The difference between these two transition numbers measures the time-irreversibility of the forward sequence,

J_c(c' → c) = π_c(c') q_c(c|c') − π_c(c) q_c(c'|c). (6)

Then, J_c(c' → c) is said to be the probability flux from c' to c in SS. If J_c(c' → c) = 0 for arbitrary c' and c, detailed balance holds and C is time-reversible in SS. The transition probability determines the evolution dynamics of the information system. We can decompose the transition probabilities q_c(c|c') into two parts: the time-reversible part D_c and the time-irreversible part B_c,

D_c(c' → c) = [π_c(c') q_c(c|c') + π_c(c) q_c(c'|c)]/2, (7)
B_c(c' → c) = [π_c(c') q_c(c|c') − π_c(c) q_c(c'|c)]/2 = J_c(c' → c)/2. (8)

From this decomposition, we can see that the information dynamics is determined by two driving forces.
One of the driving forces is determined by the steady-state probability distribution and is time-reversible. The other driving force for the information system dynamics is the steady-state probability flux, which breaks detailed balance and quantifies the time-irreversibility. Since the steady-state probability measures the weight of the information state, it quantifies the information landscape. If we define the potential landscape for the information system as φ = − log π, then D_c(c' → c) can be expressed through the difference or "gradient" of the potential landscape between the states. Therefore, this reversible part of the information dynamics is determined by the "gradient" of the information landscape. The steady-state probability flux measures the information flow in the dynamics and can therefore be termed the information flux. It is a direct measure of the nonequilibriumness in terms of time-irreversibility.
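A minimal Python sketch of this landscape-flux decomposition (with an assumed, strictly positive 3-state kernel) computes the flux J_c and the two parts D_c and B_c explicitly:

```python
import numpy as np

# q[c, c'] = q_c(c | c'): an assumed column-stochastic 3-state kernel.
q = np.array([[0.6, 0.3, 0.1],
              [0.3, 0.4, 0.5],
              [0.1, 0.3, 0.4]])

# Stationary distribution from the Perron eigenvector of q.
w, v = np.linalg.eig(q)
pi = np.real(v[:, np.argmax(np.real(w))])
pi /= pi.sum()

N = pi[None, :] * q          # N[c, c'] = pi(c') q(c|c'): forward transition number
J = N - N.T                  # flux J(c' -> c): forward minus reverse transitions
D = 0.5 * (N + N.T)          # time-reversible part
B = 0.5 * J                  # time-irreversible part (half the flux)

# The forward transition number splits as D + B, the reverse as D - B.
print(np.allclose(N, D + B), np.allclose(N.T, D - B))
print(J)
```

In SS the net flux out of every state vanishes, so each row and column of J sums to zero even when the individual J(c' → c) do not.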
By Eqs. (7,8), we have the following relations,

π_c(c') q_c(c|c') = D_c(c' → c) + B_c(c' → c),
π_c(c) q_c(c'|c) = D_c(c' → c) − B_c(c' → c). (9)

As we can see in the next section, D_c and B_c are useful for quantifying the time-reversible and time-irreversible observables of C, respectively. We denote by P(C^T) the probability of a time sequence C^T and by P(C̃^T) that of its time-reverse, for C = X, S, or Z.
Then, by the relations given in Eq. (9), P(C^T) − P(C̃^T) = 0 holds for arbitrary C^T if and only if J_c(c' → c) = 0 for arbitrary c' and c. This conclusion holds for sequences of arbitrary length T. Thus, a non-vanishing J_c fully describes the time-irreversibility of C, for C = X, S, or Z.
We show the relations between the flux of the whole system, J_z, and that of the subsystem, J_x, as follows:

J_x(x' → x) = Σ_{s,s'} J_z((x', s') → (x, s)). (10)

Similarly, we have

J_s(s' → s) = Σ_{x,x'} J_z((x', s') → (x, s)). (11)

These relations indicate that the subsystem fluxes J_x and J_s can be seen as coarse-grained versions of the total system flux J_z, obtained by averaging over the other part of the system, S and X respectively. We should emphasize that a non-vanishing J_z does not mean that X or S is time-irreversible, and vice versa.
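These coarse-graining relations can be checked numerically. In the following Python sketch, the joint kernel for Z = (X, S) is an assumed example, and the marginal flux of X, built from the one-step steady-state statistics, is compared with the joint flux summed over the other subsystem:

```python
import numpy as np

# Joint chain Z = (X, S) with X, S in {0, 1}; joint state z indexed as 2*x + s.
# Qz[z, z'] = q_z(z | z') is an assumed column-stochastic matrix.
Qz = np.array([
    [0.5, 0.2, 0.1, 0.3],
    [0.2, 0.5, 0.2, 0.1],
    [0.2, 0.1, 0.5, 0.2],
    [0.1, 0.2, 0.2, 0.4],
])

w, v = np.linalg.eig(Qz)
pi_z = np.real(v[:, np.argmax(np.real(w))])
pi_z /= pi_z.sum()

N = pi_z[None, :] * Qz                  # N[z, z'] = pi_z(z') q_z(z | z')
Jz = N - N.T                            # joint flux J_z(z' -> z)

# One-step SS statistics of X: N_x(x' -> x) = sum_{s,s'} pi_z(x',s') q_z(x,s|x',s').
# Reshape [z, z'] -> [x, s, x', s'], then sum over s (axis 1) and s' (axis 3).
Nx = N.reshape(2, 2, 2, 2).sum(axis=(1, 3))
Jx = Nx - Nx.T                          # marginal flux J_x(x' -> x)

# Coarse-graining: J_x equals J_z summed over the demon variables s, s'.
Jz_coarse = Jz.reshape(2, 2, 2, 2).sum(axis=(1, 3))
print(np.allclose(Jx, Jz_coarse))
```

The same reshaping with the complementary axes gives J_s from J_z.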

Mutual Information Decomposition to Time-Reversible and Time-Irreversible Parts
According to information theory, the two interacting information systems represented by the bivariate Markov chain Z can be characterized by the Mutual Information Rate (MIR) between the marginal chains X and S in SS. The MIR represents the correlation between the two interacting information systems. It is defined on the probabilities of all possible time sequences, P(Z^T), P(X^T), and P(S^T), and is given by

I(X, S) = lim_{T→∞} (1/T) Σ_{Z^T} P(Z^T) log [ P(Z^T) / (P(X^T) P(S^T)) ]. (12)

It measures the correlation between X and S per unit time, or, in other words, the effective bits of information that X and S exchange with each other per unit time. The MIR must be non-negative, and a vanishing I(X, S) indicates that X and S are independent of each other. More explicitly, the probabilities of these sequences can be evaluated by using Eqs. (2,3,4):

P(X^T) = π_x(X(1)) Π_{t=1}^{T−1} q_x(X(t+1)|X(t)),
P(S^T) = π_s(S(1)) Π_{t=1}^{T−1} q_s(S(t+1)|S(t)),
P(Z^T) = π_z(Z(1)) Π_{t=1}^{T−1} q_z(Z(t+1)|Z(t)).
By substituting these probabilities into Eq. (12) (see Appendix), we have the exact expression of the MIR,

I(X, S) = Σ_{z,z'} π_z(z') q_z(z|z') i(z|z'), (13)

where i(z|z') = log [ q_z(z|z') / (q_x(x|x') q_s(s|s')) ] is the conditional correlation between the two subsystem states. Corresponding to the decomposition in Eqs. (7,8), the MIR can be decomposed into a time-reversible part and a time-irreversible part,

I(X, S) = I_D(X, S) + I_B(X, S),
I_D(X, S) = Σ_{z,z'} D_z(z' → z) i(z|z'),
I_B(X, S) = Σ_{z,z'} B_z(z' → z) i(z|z'). (14)

Then I_B(X, S) measures the change of the averaged conditional correlation between X and S when a sequence of Z is reversed in time.
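Under the Markov-marginal assumption, both the exact MIR and its decomposition can be evaluated directly. The following Python sketch (assumed strictly positive 4-state joint kernel; the marginal kernels are built from the one-step steady-state statistics) checks the identity I = I_D + I_B:

```python
import numpy as np

# Joint chain Z = (X, S) with X, S in {0, 1}; joint state z indexed as 2*x + s.
# Qz[z, z'] = q_z(z | z') is an assumed, strictly positive example kernel.
Qz = np.array([
    [0.5, 0.2, 0.1, 0.3],
    [0.2, 0.5, 0.2, 0.1],
    [0.2, 0.1, 0.5, 0.2],
    [0.1, 0.2, 0.2, 0.4],
])

w, v = np.linalg.eig(Qz)
pi_z = np.real(v[:, np.argmax(np.real(w))])
pi_z /= pi_z.sum()

N = pi_z[None, :] * Qz                       # N[z, z'] = pi_z(z') q_z(z | z')

# Marginal kernels consistent with the one-step steady-state statistics:
# q_x(x | x') = N_x(x' -> x) / pi_x(x'), and likewise for S.
Nx = N.reshape(2, 2, 2, 2).sum(axis=(1, 3))  # sum over s, s'
Ns = N.reshape(2, 2, 2, 2).sum(axis=(0, 2))  # sum over x, x'
qx = Nx / Nx.sum(axis=0, keepdims=True)
qs = Ns / Ns.sum(axis=0, keepdims=True)

# Conditional correlation i(z|z') = log[ q_z(z|z') / (q_x(x|x') q_s(s|s')) ].
x, s, xp, sp = np.indices((2, 2, 2, 2))
i_cond = np.log(Qz.reshape(2, 2, 2, 2) / (qx[x, xp] * qs[s, sp])).reshape(4, 4)

D = 0.5 * (N + N.T)                          # time-reversible part of the dynamics
B = 0.5 * (N - N.T)                          # time-irreversible part (half the flux)

I = float(np.sum(N * i_cond))                # exact MIR, Eq. (13)
ID = float(np.sum(D * i_cond))               # time-reversible part of the MIR
IB = float(np.sum(B * i_cond))               # time-irreversible part of the MIR
print(I, ID, IB)
```

Here I_D carries the correlations shared by the forward and backward processes, while I_B changes sign under time reversal.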

Relationship Between Mutual Information and Entropy Production
The Entropy Production Rate (EPR) at steady state is a quantitative measure of nonequilibriumness which characterizes the time-irreversibility of the underlying processes. The EPRs of the information system described by the bivariate Markov chain are given by

R_c = (1/2) Σ_{c,c'} J_c(c' → c) log [ π_c(c') q_c(c|c') / (π_c(c) q_c(c'|c)) ],  for C = Z, X, S,

where the total and subsystem entropy production rates R_z, R_x, and R_s correspond to Z, X, and S, respectively.
Here, R_z usually contains the detailed information about the interactions of the system (or subsystems) and the environment; R_x and R_s provide coarse-grained information about the time-irreversible observables of X and S, respectively. Each non-vanishing EPR indicates that the corresponding Markov chain is time-irreversible.
Again, we emphasize that a non-vanishing R z does not mean X or S is time-irreversible and vice versa.
We are interested in the connection between these EPRs and the mutual information. We can associate them with I_B(X, S) by noting Eqs. (10,11,14). We have

I_B(X, S) = (R_z − R_x − R_s)/2,

i.e., I_B(X, S) is intimately related to the EPRs. This builds a bridge between the EPRs and the irreversible part of the mutual information. Moreover, we also have

R_z = R_x + R_s + 2 I_B(X, S).

This indicates that the time-irreversible MIR contributes to the detailed EPR. In other words, the difference between the entropy production rate of the whole system and those of the subsystems provides the origin of the time-irreversible part of the mutual information. This gives the nonequilibrium thermodynamic origin of the irreversible mutual information or correlations. Of course, since the EPR is related to the flux directly, as seen from the above definitions, the origin of the EPR, or of the nonequilibrium thermodynamics, is the non-vanishing information flux of the nonequilibrium dynamics. On the other hand, the irreversible part of the mutual information measures the correlations and contributes to the correlated part of the EPR between the subsystems.
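The EPR relation can be verified numerically. The following Python sketch (an assumed, strictly positive 4-state joint kernel; marginal kernels built from the one-step steady-state statistics) checks that R_z = R_x + R_s + 2 I_B(X, S) holds exactly in this setting:

```python
import numpy as np

# Joint chain Z = (X, S), X, S in {0, 1}; z indexed as 2*x + s (assumed example).
Qz = np.array([
    [0.5, 0.2, 0.1, 0.3],
    [0.2, 0.5, 0.2, 0.1],
    [0.2, 0.1, 0.5, 0.2],
    [0.1, 0.2, 0.2, 0.4],
])

w, v = np.linalg.eig(Qz)
pi_z = np.real(v[:, np.argmax(np.real(w))])
pi_z /= pi_z.sum()

N = pi_z[None, :] * Qz                        # forward transition numbers of Z

def epr(N):
    """EPR = (1/2) sum J log(forward/reverse) = sum N log(N / N^T)."""
    return float(np.sum(N * np.log(N / N.T)))

Nx = N.reshape(2, 2, 2, 2).sum(axis=(1, 3))   # one-step SS statistics of X
Ns = N.reshape(2, 2, 2, 2).sum(axis=(0, 2))   # one-step SS statistics of S

# Time-irreversible part of the MIR, Eq. (14), with SS-consistent marginals.
qx = Nx / Nx.sum(axis=0, keepdims=True)
qs = Ns / Ns.sum(axis=0, keepdims=True)
x, s, xp, sp = np.indices((2, 2, 2, 2))
i_cond = np.log(Qz.reshape(2, 2, 2, 2) / (qx[x, xp] * qs[s, sp])).reshape(4, 4)
IB = float(np.sum(0.5 * (N - N.T) * i_cond))

Rz, Rx, Rs = epr(N), epr(Nx), epr(Ns)
print(Rz, Rx + Rs + 2 * IB)                   # the two agree
```

The stationary-distribution factors cancel inside each EPR sum at steady state, which is what makes the identity exact rather than approximate.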

A Simple Case: Blind Demon
The transition probabilities of the demon read q_s(s|s') = P(s), i.e., the blind demon refreshes its state from a fixed distribution P(s), independently of its past state.
The transition probabilities of the joint chain follow accordingly, and the corresponding steady-state distributions, or the information landscape, read π_s(s) = P(s) and π_z(x, s) = P(s) π_x(x), so that the joint landscape factorizes into the demon and system parts. From these, the information fluxes, the EPRs, the MIR, and the time-irreversible part of I(X, S) can be evaluated in closed form from the definitions above.

Conclusion
In this work, we identify the driving forces for information system dynamics. We show that the information system dynamics is determined by both the information landscape and the information flux, representing the time-reversible and time-irreversible parts of the information dynamics. We further demonstrate that the mutual information, representing the correlations, can be decomposed into a time-reversible part and a time-irreversible part, originating from the landscape and flux decomposition of the information dynamics. Finally, we uncover the intimate relationship between the difference of the entropy production of the whole system and the subsystems, and the time-irreversible part of the mutual information. This will help in understanding the nonequilibrium behaviour of interacting information system dynamics in random environments.
Furthermore, we believe that our conclusions can be generalized to BMCs with non-Markovian marginal chains, which we will discuss in a separate work.

Appendix
Here, we derive the exact form of Mutual Information Rate (MIR, Eq.(13)) in steady state by using the cumulant-generating function.
We write an arbitrary time sequence of Z of length T in the form Z^T = {Z(1), Z(2), ..., Z(T)}, where Z(i) (for i ≥ 1) denotes the state at time i. The corresponding probability of Z^T is of the form

P(Z^T) = π_z(Z(1)) Π_{t=1}^{T−1} q_z(Z(t+1)|Z(t)). (A.1)

We let the chain U = (X, S) denote a process in which X and S follow the same marginal Markov dynamics as in Z but are independent of each other. Then the transition probabilities of U read

q_u(u|u') = q_u(x, s|x', s') = q_x(x|x') q_s(s|s'). (A.2)

Then the probability of a time sequence of U, U^T, with the same trajectory as Z^T reads

P(U^T) = π_u(Z(1)) Π_{t=1}^{T−1} q_u(Z(t+1)|Z(t)), (A.3)

with π_u(x, s) = π_x(x) π_s(s) being the stationary probability of U. With these, the MIR in Eq. (12) is the growth rate of the log-likelihood ratio between P(Z^T) and P(U^T), which can be evaluated through the cumulant-generating function. Define the tilted matrix G(m) with entries G_{z,z'}(m) = q_z(z|z') [q_z(z|z')/q_u(z|z')]^m, and let μ(m) be its largest eigenvalue with right eigenvector v(m), where Q_z is the transition matrix of Z and π_z is the stationary distribution of Z. It can be verified that G(0) = Q_z, v(0) = π_z, μ(0) = 1, and Q_z π_z = π_z. The initial-distribution term Σ_z π_z(z) log [π_z(z)/π_u(z)], for z ∈ Z, does not grow with T and therefore does not contribute to the rate. Taking the derivative of log μ(m) at m = 0 then yields the MIR,

I(X, S) = d log μ(m)/dm |_{m=0} = Σ_{x,s,x',s'} π_z(x', s') q_z(x, s|x', s') log { q_z(x, s|x', s') / [q_x(x|x') q_s(s|s')] }, (A.11)

which is the exact expression given in Eq. (13).
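The tilted-matrix construction can be checked numerically. The Python sketch below (an assumed joint kernel; marginals from the one-step steady-state statistics) compares a finite-difference derivative of log μ(m) at m = 0 with the direct sum in Eq. (A.11):

```python
import numpy as np

# Assumed joint kernel for Z = (X, S), X, S in {0, 1}; z indexed as 2*x + s.
Qz = np.array([
    [0.5, 0.2, 0.1, 0.3],
    [0.2, 0.5, 0.2, 0.1],
    [0.2, 0.1, 0.5, 0.2],
    [0.1, 0.2, 0.2, 0.4],
])

w, v = np.linalg.eig(Qz)
pi_z = np.real(v[:, np.argmax(np.real(w))])
pi_z /= pi_z.sum()
N = pi_z[None, :] * Qz

# Independent reference chain U: q_u(z|z') = q_x(x|x') q_s(s|s') from SS statistics.
Nx = N.reshape(2, 2, 2, 2).sum(axis=(1, 3))
Ns = N.reshape(2, 2, 2, 2).sum(axis=(0, 2))
qx = Nx / Nx.sum(axis=0, keepdims=True)
qs = Ns / Ns.sum(axis=0, keepdims=True)
x, s, xp, sp = np.indices((2, 2, 2, 2))
Qu = (qx[x, xp] * qs[s, sp]).reshape(4, 4)

def log_mu(m):
    """log of the largest eigenvalue of the tilted matrix G(m)."""
    G = Qz * (Qz / Qu) ** m
    return np.log(np.max(np.real(np.linalg.eigvals(G))))

# Direct evaluation of the MIR, Eq. (A.11) / Eq. (13).
I_direct = float(np.sum(N * np.log(Qz / Qu)))

# Finite-difference derivative of log mu(m) at m = 0.
eps = 1e-6
I_spectral = (log_mu(eps) - log_mu(-eps)) / (2 * eps)
print(I_direct, I_spectral)
```

At m = 0 the tilted matrix reduces to Q_z with Perron eigenvalue 1, so log μ(0) = 0 and the derivative isolates the linear growth rate of the log-likelihood ratio.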