Open Access: this article is freely available and re-usable.

*Entropy* **2017**, *19*(12), 678; https://doi.org/10.3390/e19120678

Article

Information Landscape and Flux, Mutual Information Rate Decomposition and Connections to Entropy Production

^{1} State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Changchun, Jilin 130022, China

^{2} Department of Chemistry and Physics, State University of New York, Stony Brook, NY 11794, USA

^{*} Author to whom correspondence should be addressed.

Received: 29 September 2017 / Accepted: 6 December 2017 / Published: 11 December 2017

## Abstract

We explore the dynamics of two interacting information systems. We show that for Markovian marginal systems, the driving force for information dynamics is determined by both the information landscape and the information flux. While the information landscape can be used to construct the driving force that describes the equilibrium time-reversible information system dynamics, the information flux can be used to describe the nonequilibrium time-irreversible behaviors of the information system dynamics. The information flux explicitly breaks the detailed balance and is a direct measure of the degree of nonequilibrium or time-irreversibility. We further demonstrate that the mutual information rate between the two subsystems can be decomposed into equilibrium time-reversible and nonequilibrium time-irreversible parts. This decomposition of the Mutual Information Rate (MIR) corresponds explicitly to the information landscape-flux decomposition when the two subsystems behave as Markov chains. Finally, we uncover the intimate relationship between the nonequilibrium thermodynamics in terms of the entropy production rates and the time-irreversible part of the mutual information rate. We find that this relationship and the MIR decomposition still hold for the more general stationary and ergodic cases. We demonstrate the above features with two examples of bivariate Markov chains.

Keywords:

nonequilibrium thermodynamics; landscape-flux decomposition; mutual information rate; entropy production rate

## 1. Introduction

There is growing interest in studying two interacting information systems in the fields of control theory, information theory, communication theory, nonequilibrium physics and biophysics [1,2,3,4,5,6,7,8,9]. Significant progress has been made recently towards the understanding of the information system in terms of information thermodynamics [10,11,12,13]. However, the identification of the global driving forces for the information system dynamics is still challenging. Here, we aim to fill this gap by quantifying the driving forces for the information system dynamics. Inspired by the recent development of landscape and flux theory for continuous nonequilibrium systems [14,15,16] and the Markov chain decomposition dynamics for discrete systems [17,18,19,20,21,22,23], we show that, at least for the underlying marginal Markovian cases, the driving force for information dynamics is determined by both the information landscape and the information flux. The information landscape can be used to construct the driving force responsible for the equilibrium time-reversible part of the information dynamics. The information flux explicitly breaks the detailed balance and provides a quantitative measure of the degree of nonequilibrium or time-irreversibility. It is responsible for the time-irreversible part of the information dynamics. The Mutual Information Rate (MIR) [24] represents the correlation between two information subsystems. We show that the MIR between the two subsystems can be decomposed into time-reversible and time-irreversible parts. In particular, when the two subsystems act as Markov chains, this decomposition can be expressed in terms of the information landscape-flux decomposition for Markovian dynamics. An important signature of nonequilibrium is the Entropy Production Rate (EPR) [17,25,26]. We also uncover the intimate relation between the EPRs and the time-irreversible part of the MIR. We demonstrate the above features with two cases of bivariate Markov chains. Furthermore, we show that the decomposition of the MIR and the relationship between the EPRs and the time-irreversible part of the MIR still hold for more general stationary and ergodic cases.

## 2. Bivariate Markov Chains

Markov chains are often assumed for the underlying dynamics of the total system in random environments. When the two subsystems jointly form a Markov chain in continuous or discrete time, the resulting chain is called the Bivariate Markov Chain (BMC, a special case of the multivariate Markov chain with two stochastic variables). The processes of the two subsystems are correspondingly said to be marginal processes or marginal chains. The BMC was used to model ion channel currents [2]. It was also used to model delays and congestion in a computer network [3]. Recently, different models of BMC appeared in nonequilibrium statistical physics for capturing or implementing Maxwell’s demon [4,5,6], which can be seen as one marginal chain in the BMC exerting feedback control on the other marginal chain. Although the BMC has been studied for decades, there are still challenges in quantifying the dynamics of the whole system, as well as of the two subsystems. This is because neither of the two subsystems needs to be a Markov chain in general [7], and the quantification of the probabilities (densities) for the trajectories of the two subsystems involves complicated random matrix multiplications [8]. As a result, the problem is not, in general, exactly analytically solvable. The corresponding numerical solutions often lack direct mathematical and physical interpretations.

The conventional analysis of the BMC focuses on the mutual information [9] of the two subsystems for quantifying the underlying information correlations. There are three main representations of this. The first one was proposed and emphasized in the works of Sagawa, T. and Ueda, M. [11] and of Parrondo, J. M. R., Horowitz, J. M. and Sagawa, T. [10] for explaining the mechanism of Maxwell’s demon in Szilard’s engine. In this representation, the mutual information between the demon and the controlled system characterizes the observation and the feedback of the demon. This leads to an elegant approach, which includes the increment of the mutual information in a unified fluctuation relation. The second representation was proposed in the work of Horowitz, J. M. and Esposito, M. [12] in an attempt to explain the violation of the second law in a specific BMC, the bipartite model, in which the mutual information is divided into two parts corresponding to the two subsystems; these parts were termed the information flows. This representation tries to explain the mechanism of the demon because one can see that the information flows do contribute to the entropy production of both the demon and the controlled system. The first two representations are based on the ensembles of the subsystem states. This means that the mutual information is defined only on the time-sliced distributions of the system states, which lack information about the subsystem dynamics, namely the time-correlations of the observation and feedback of the demon. The last representation was seen in the work of Barato, A. C., Hartich, D. and Seifert, U. [13], where a more general definition of mutual information in information theory was used, which is defined on the trajectories of the two subsystems. More exactly, this is the so-called Mutual Information Rate (MIR) [24], which quantifies the correlation between the two subsystem dynamics. However, due to the difficulties arising from the possibly non-Markovian nature of the marginal chains, exactly solvable models and comprehensive conclusions remain challenging to obtain within this representation.

In this study, we investigate the discrete-time BMC in terms of both stochastic information dynamics and thermodynamics. To avoid the technical difficulty caused by non-Markovian dynamics, we first assume that the two marginal chains follow Markovian dynamics. The non-Markovian case will be discussed elsewhere. We explore the time-irreversibility of the BMC and the marginal processes in the steady state. Then, we decompose the driving force for the underlying dynamics into the information landscape and the information flux [14,15,16], which can be used to describe the time-reversible and time-irreversible parts, respectively. We also prove that the non-vanishing flux fully describes the time-irreversibility of the BMC and the marginal processes.

We focus on the mutual information rate between the two marginal chains. Since the two marginal chains are assumed to be Markov chains here, the mutual information rate is exactly analytically solvable and can be seen as the averaged conditional correlation between the two subsystem states. Here, the conditional correlations reveal the time correlations between the past states and the future states.

Corresponding to the landscape-flux decomposition in stochastic dynamics, we decompose the MIR into two parts: the time-reversible part and the time-irreversible part. The time-reversible part measures the part of the correlations between the two marginal chains that is present in both the forward and backward processes of the BMC. The time-irreversible part measures the difference between the correlations in the forward and backward processes of the BMC. We can see that a non-vanishing time-irreversible part of the MIR must be driven by a non-vanishing flux in the steady state, and it can be seen as a sufficient condition for a BMC to be time-irreversible.

We also reveal the important fact that the time-irreversible parts of MIR contribute to the nonequilibrium Entropy Production Rate (EPR) of the BMC by the simple equality:

$$\text{EPR of BMC}=\text{EPR of 1st marginal chain}+\text{EPR of 2nd marginal chain}+2\times \text{time-irreversible part of MIR}.$$

The decomposition of the MIR and the relation between the time-irreversible part of MIR and EPRs can also be found in stationary and ergodic non-Markovian cases, which will be given in the discussions in the Appendix. This may help to develop a general theory of nonequilibrium non-Markovian interacting information systems.

## 3. Information Landscape and Information Flux for Determining the Information Dynamics, Time-Irreversibility

Consider the case that two interacting information systems form a finite-state, discrete-time, ergodic and irreducible bivariate Markov chain,

$$Z=(X,S)=\{(X(t),S(t)),\ t\ge 1\}.$$

We assume that the information state space of X is given by $\mathcal{X}=\{1,\dots ,d\}$ and the information state space of S is given by $\mathcal{S}=\{1,\dots ,l\}$. The information state space of Z is then given by $\mathcal{Z}=\mathcal{X}\times \mathcal{S}$. The stochastic information dynamics can then be quantitatively described by the time evolution of the probability distribution over the information state space $\mathcal{Z}$, characterized by the following master equation (or the information system dynamics) in discrete time,

$$p_z(z;t+1)=\sum_{z'}q_z(z|z')\,p_z(z';t),\quad \mathrm{for}\ t\ge 1\ \mathrm{and}\ z\in \mathcal{Z},$$

where $p_z(z;t)=p_z(x,s;t)$ is the probability of observing state z (or the joint probability of $X=x$ and $S=s$) at time t; $q_z(z|z')=q_z(x,s|x',s')\ge 0$ are the transition probabilities from $z'=(x',s')$ to $z=(x,s)$, and satisfy $\sum_z q_z(z|z')=1$.

We assume that there exists a unique stationary distribution ${\mathsf{\pi}}_{z}$ such that ${\mathsf{\pi}}_{z}\left(z\right)={\sum}_{{z}^{\prime}}{q}_{z}\left(z\right|{z}^{\prime}){\mathsf{\pi}}_{z}\left({z}^{\prime}\right)$. Then, given an arbitrary initial probability distribution, the probability distribution goes to ${\mathsf{\pi}}_{z}$ exponentially fast in time. If the initial distribution is ${\mathsf{\pi}}_{z}$, we say that Z is in Steady State (SS), and our discussion is based on this SS.
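As a minimal numerical sketch of the steady state described above (an illustration, not part of the original analysis), the following Python code computes a stationary distribution and checks that iterating the master equation from an arbitrary initial distribution approaches it. The three-state matrix `q`, the helper name `stationary_distribution`, and the storage convention `q[z, z_prime]` for $q(z|z')$ (so that columns sum to one) are assumptions made for illustration.

```python
import numpy as np

def stationary_distribution(q):
    """Return pi with pi = q @ pi for a column-stochastic q[z, z_prime] = q(z|z')."""
    eigvals, eigvecs = np.linalg.eig(q)
    k = np.argmin(np.abs(eigvals - 1.0))   # index of the eigenvalue closest to 1
    pi = np.real(eigvecs[:, k])
    return pi / pi.sum()                   # normalize (also fixes the overall sign)

# Hypothetical 3-state chain; every column sums to 1.
q = np.array([[0.5, 0.2, 0.3],
              [0.3, 0.5, 0.2],
              [0.2, 0.3, 0.5]])
pi = stationary_distribution(q)

# Iterate the discrete-time master equation p(t+1) = q p(t) from a point mass.
p = np.array([1.0, 0.0, 0.0])
for _ in range(200):
    p = q @ p

print(pi, p)
```

For this particular matrix the rows also sum to one, so the stationary distribution is uniform; the state labels run from 0 rather than from 1 as in the text.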

The marginal chains of Z, i.e., X and S, do not need to be Markov chains in general. For the simplicity of analysis, we assume that both marginal chains are Markov chains, and the corresponding transition probabilities are given by $q_x(x|x')$ and $q_s(s|s')$ (for $x,x'\in \mathcal{X}$ and $s,s'\in \mathcal{S}$), respectively. Then, we have the following master equations (or the information system dynamics) for X and S, respectively,

$$p_x(x;t+1)=\sum_{x'}q_x(x|x')\,p_x(x';t),$$

and,

$$p_s(s;t+1)=\sum_{s'}q_s(s|s')\,p_s(s';t),$$

where $p_x(x;t)$ and $p_s(s;t)$ are the probabilities of observing $X=x$ and $S=s$ at time t, respectively.

We consider that both Equations (3) and (4) have unique stationary solutions ${\mathsf{\pi}}_{x}$ and ${\mathsf{\pi}}_{s}$, which satisfy ${\mathsf{\pi}}_{x}\left(x\right)={\sum}_{{x}^{\prime}}{q}_{x}\left(x\right|{x}^{\prime}){\mathsf{\pi}}_{x}\left({x}^{\prime}\right)$ and ${\mathsf{\pi}}_{s}\left(s\right)={\sum}_{{s}^{\prime}}{q}_{s}\left(s\right|{s}^{\prime}){\mathsf{\pi}}_{s}\left({s}^{\prime}\right)$ respectively. Furthermore, we assume that when Z is in SS, ${\mathsf{\pi}}_{x}$ and ${\mathsf{\pi}}_{s}$ are also achieved. The relations between ${\mathsf{\pi}}_{x}$, ${\mathsf{\pi}}_{s}$ and ${\mathsf{\pi}}_{z}$ read,

$$\begin{array}{c}\hfill \left\{\begin{array}{c}{\mathsf{\pi}}_{x}\left(x\right)={\sum}_{s}{\mathsf{\pi}}_{z}(x,s),\hfill \\ {\mathsf{\pi}}_{s}\left(s\right)={\sum}_{x}{\mathsf{\pi}}_{z}(x,s).\hfill \end{array}\right.\end{array}$$

In the rest of this paper, we let $X^T=\{X(1),X(2),\dots ,X(T)\}$, $S^T=\{S(1),S(2),\dots ,S(T)\}$, and $Z^T=\{Z(1),Z(2),\dots ,Z(T)\}=(X^T,S^T)$ denote the time sequences of X, S and Z in time T, respectively.

To characterize the time-irreversibility of a Markov chain C in information dynamics in SS, we introduce the concept of probability flux. Here, we let C denote an arbitrary Markov chain in $\{Z,X,S\}$, and let c, $\mathsf{\pi}_c$, $q_c$ and $C^T$ denote an arbitrary state of C, the stationary distribution of C, the transition probabilities of C and a time sequence of C in time T and in SS, respectively.

The average number of transitions from state $c'$ to state c in unit time in SS, denoted by $N(c'\to c)$, can be obtained as:

$$N(c'\to c)=\mathsf{\pi}_c(c')\,q_c(c|c').$$

This is also the probability of the time sequence $C^T=\{C(1)=c',C(2)=c\}$ ($T=2$). Correspondingly, the average number of reverse transitions, denoted by $N(c\to c')$, reads:

$$N(c\to c')=\mathsf{\pi}_c(c)\,q_c(c'|c).$$

This is also the probability of the time-reversed sequence $\tilde{C}^T=\{C(1)=c,C(2)=c'\}$ ($T=2$). The difference between these two transition numbers measures the time-irreversibility of the forward sequence $C^T$ in SS,

$$\begin{aligned} J_c(c'\to c) &= N(c'\to c)-N(c\to c') \\ &= P(C^T)-P(\tilde{C}^T) \\ &= \mathsf{\pi}_c(c')\,q_c(c|c')-\mathsf{\pi}_c(c)\,q_c(c'|c),\quad \mathrm{for}\ C=X,S,\ \mathrm{or}\ Z. \end{aligned}$$

Then, $J_c(c'\to c)$ is said to be the probability flux from $c'$ to c in SS. If $J_c(c'\to c)=0$ for arbitrary $c'$ and c, then $C^T$ ($T=2$) is time-reversible; otherwise, when $J_c(c'\to c)\ne 0$, $C^T$ is time-irreversible. Clearly, we have from Equation (6) that:

$$J_c(c'\to c)=-J_c(c\to c').$$
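The probability flux defined above can be sketched numerically as follows. This is a minimal illustration under stated assumptions: the three-state matrix `q` is a hypothetical example whose stationary distribution is uniform, the function name `probability_flux` is made up for this sketch, and the arrays use the convention `q[c, c_prime]` for $q_c(c|c')$ with states labeled from 0.

```python
import numpy as np

def probability_flux(q, pi):
    """Return J with J[c, c_prime] = J(c' -> c) = pi(c') q(c|c') - pi(c) q(c'|c)."""
    n = q * pi[np.newaxis, :]   # n[c, c_prime] = N(c' -> c) = pi(c') q(c|c')
    return n - n.T              # n.T[c, c_prime] = N(c -> c')

# Hypothetical 3-state chain with a preferred direction of circulation.
q = np.array([[0.5, 0.2, 0.3],
              [0.3, 0.5, 0.2],
              [0.2, 0.3, 0.5]])
pi = np.full(3, 1.0 / 3.0)      # stationary distribution of this doubly stochastic q

J = probability_flux(q, pi)
print(J)
```

Here `J[1, 0]` is the flux from state 0 to state 1; it is nonzero, so this example chain is time-irreversible, and `J` is antisymmetric as stated in the text.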

The transition probability determines the evolution dynamics of the information system. We can decompose the transition probabilities ${q}_{c}\left(c\right|{c}^{\prime})$ into two parts: the time-reversible part ${D}_{c}$ and time-irreversible part ${B}_{c}$, which read:

$$\begin{array}{c}{q}_{c}\left(c\right|{c}^{\prime})={D}_{c}({c}^{\prime}\to c)+{B}_{c}({c}^{\prime}\to c),\phantom{\rule{4pt}{0ex}}\mathrm{with}\phantom{\rule{4.0pt}{0ex}}\hfill \\ \left\{\begin{array}{c}{D}_{c}({c}^{\prime}\to c)=\frac{1}{2{\mathsf{\pi}}_{c}\left({c}^{\prime}\right)}({\mathsf{\pi}}_{c}\left({c}^{\prime}\right){q}_{c}\left(c\right|{c}^{\prime})+{\mathsf{\pi}}_{c}\left(c\right){q}_{c}\left({c}^{\prime}\right|c)),\phantom{\rule{4.0pt}{0ex}}\hfill \\ {B}_{c}({c}^{\prime}\to c)=\frac{1}{2{\mathsf{\pi}}_{c}\left({c}^{\prime}\right)}{J}_{c}({c}^{\prime}\to c).\hfill \end{array}\right.\hfill \end{array}$$

From this decomposition, we can see that the information system dynamics is determined by two driving forces. One of the driving forces is determined by the steady state probability distribution. This part of the driving force is time-reversible. The other driving force for the information dynamics is the steady state probability flux, which breaks the detailed balance and quantifies the time-irreversibility. Since the steady state probability distribution measures the weight of each information state, it can be used to quantify the information landscape. If we define the potential landscape for the information system as $\varphi =-\log \mathsf{\pi}$, then the driving force $D_c(c'\to c)=\frac{1}{2}\left(q_c(c|c')+\frac{\mathsf{\pi}_c(c)}{\mathsf{\pi}_c(c')}q_c(c'|c)\right)=\frac{1}{2}\left(q_c(c|c')+\exp[-(\varphi_c(c)-\varphi_c(c'))]\,q_c(c'|c)\right)$ is expressed in terms of the difference of the potential landscape. This is analogous to the landscape-flux decomposition of Langevin dynamics in [15]. Notice that the information landscape is directly related to the steady state probability distribution of the information system. In general, the information landscape is a nonequilibrium landscape, since the detailed balance is often broken. Only when the detailed balance is preserved does the nonequilibrium information landscape reduce to the equilibrium information landscape. Even though the information landscape is not at equilibrium in general, the driving force $D_c(c'\to c)$ is time-reversible by construction of the decomposition. The steady state probability flux measures the information flow in the dynamics and can therefore be termed the information flux. In fact, the nonzero information flux explicitly breaks the detailed balance because of the net flow to or from the system. It is therefore a direct measure of the degree of nonequilibrium or time-irreversibility in terms of the detailed balance breaking.

Note that the decomposition for the discrete Markovian information process can be viewed as the separation of the current, corresponding to $2B_c(c'\to c)\mathsf{\pi}_c(c')$ here, and the activity, corresponding to $2D_c(c'\to c)\mathsf{\pi}_c(c')$ here, in a previous study [19]. The landscape and flux decomposition here for the reduced information dynamics is in a similar spirit to the whole state space decomposition with the information system and the associated environments. When the detailed balance is broken, the information landscape (defined as the negative logarithm of the steady state probability, $\varphi =-\log \mathsf{\pi}$) is not the same as the equilibrium landscape under the detailed balance. There can be a uniqueness issue related to the decomposition. To avoid confusion, we make a physical choice: we fix the gauge so that the information landscape always coincides with the equilibrium landscape when the detailed balance is satisfied. In other words, we want to make sure that the Boltzmann law applies at equilibrium with detailed balance. In this way, we can decompose the information landscape and information flux for nonequilibrium information systems without detailed balance. By solving the linear master equation for the steady state, we can quantify the nonequilibrium information landscape, and from that, we can obtain the corresponding steady state probability flux. Some studies discussed various aspects of this issue [18,19,27,28].

By Equations (7) and (8), we have the following relations:

$$\left\{\begin{array}{l}\mathsf{\pi}_c(c')\,D_c(c'\to c)=\mathsf{\pi}_c(c)\,D_c(c\to c'),\\ \mathsf{\pi}_c(c')\,B_c(c'\to c)=-\mathsf{\pi}_c(c)\,B_c(c\to c').\end{array}\right.$$

As we can see in the next section, ${D}_{c}$ and ${B}_{c}$ are useful for us to quantify time-reversible and time-irreversible observables of C, respectively.
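The landscape-flux decomposition of the transition probabilities and the two symmetry relations can be checked numerically. The sketch below is illustrative only: the three-state matrix `q` and the function names are hypothetical, and arrays follow the convention `q[c, c_prime]` for $q_c(c|c')$, `D[c, c_prime]` for $D_c(c'\to c)$ and `B[c, c_prime]` for $B_c(c'\to c)$.

```python
import numpy as np

def stationary_distribution(q):
    """Stationary distribution of a column-stochastic matrix q[c, c_prime]."""
    eigvals, eigvecs = np.linalg.eig(q)
    pi = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1.0))])
    return pi / pi.sum()

def landscape_flux_parts(q, pi):
    """Split q into the time-reversible part D and the time-irreversible part B."""
    n = q * pi[np.newaxis, :]                    # n[c, c'] = pi(c') q(c|c')
    D = (n + n.T) / (2.0 * pi[np.newaxis, :])    # D(c' -> c)
    B = (n - n.T) / (2.0 * pi[np.newaxis, :])    # B(c' -> c) = J(c' -> c) / (2 pi(c'))
    return D, B

# Hypothetical 3-state chain with a non-uniform stationary distribution.
q = np.array([[0.6, 0.1, 0.3],
              [0.3, 0.7, 0.1],
              [0.1, 0.2, 0.6]])
pi = stationary_distribution(q)
D, B = landscape_flux_parts(q, pi)

# The same D written through the landscape phi = -log(pi).
phi = -np.log(pi)
D_from_phi = 0.5 * (q + np.exp(-(phi[:, np.newaxis] - phi[np.newaxis, :])) * q.T)

print(np.allclose(D + B, q), np.allclose(D, D_from_phi))
```

In this sketch the columns of `D` sum to one and the columns of `B` sum to zero, so `D` alone is a valid transition matrix that satisfies detailed balance with respect to `pi`.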

We give the interpretation that the non-vanishing information flux $J_c$ fully measures the time-irreversibility of the chain C in time T for $T\ge 2$. Let $C^T$ be an arbitrary sequence of C in SS, and without loss of generality, we let $T=3$. Similar to Equation (6), the measure of the time-irreversibility of $C^T$ can be given by the difference between the probability of $C^T=\{C(1),C(2),C(3)\}$ and that of its time-reversal $\tilde{C}^T=\{C(3),C(2),C(1)\}$, as:

$$\begin{aligned} &P(C^T)-P(\tilde{C}^T) \\ &= \mathsf{\pi}_c(C(1))\,q_c(C(2)|C(1))\,q_c(C(3)|C(2))-\mathsf{\pi}_c(C(3))\,q_c(C(2)|C(3))\,q_c(C(1)|C(2)) \\ &= \mathsf{\pi}_c(C(1))\left(D_c(C(1)\to C(2))+B_c(C(1)\to C(2))\right)\left(D_c(C(2)\to C(3))+B_c(C(2)\to C(3))\right) \\ &\quad -\mathsf{\pi}_c(C(3))\left(D_c(C(3)\to C(2))+B_c(C(3)\to C(2))\right)\left(D_c(C(2)\to C(1))+B_c(C(2)\to C(1))\right), \\ &\mathrm{for}\ C=X,S\ \mathrm{or}\ Z. \end{aligned}$$

Then, by the relations given in Equation (9), we have that $P(C^T)-P(\tilde{C}^T)=0$ holds for arbitrary $C^T$ if and only if $B_c(C(1)\to C(2))=B_c(C(2)\to C(3))=0$ or equivalently $J_c(C(1)\to C(2))=J_c(C(2)\to C(3))=0$. This conclusion can be made for arbitrary $T>3$. Thus, non-vanishing $J_c$ can fully describe the time-irreversibility of C for $C=X,S$ or Z.
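To make the $T=3$ comparison concrete, the following sketch evaluates the probability of every three-step sequence and of its time reversal for two hypothetical three-state chains: one with non-vanishing flux and one with vanishing flux. The matrices, the function name `sequence_probability` and the 0-based state labels are assumptions for illustration.

```python
import numpy as np
from itertools import product

def sequence_probability(seq, q, pi):
    """P(C^T) = pi(C(1)) * prod_t q(C(t+1)|C(t)) for a stationary Markov chain."""
    prob = pi[seq[0]]
    for prev, nxt in zip(seq[:-1], seq[1:]):
        prob *= q[nxt, prev]          # q[c, c_prime] = q(c|c')
    return prob

pi = np.full(3, 1.0 / 3.0)            # both example chains have a uniform pi

# Hypothetical chain with non-vanishing flux (circulation 0 -> 1 -> 2 -> 0).
q_irr = np.array([[0.5, 0.2, 0.3],
                  [0.3, 0.5, 0.2],
                  [0.2, 0.3, 0.5]])
# Hypothetical chain with vanishing flux (detailed balance holds).
q_rev = np.full((3, 3), 1.0 / 3.0)

def forward_minus_backward(q):
    return {seq: sequence_probability(seq, q, pi)
                 - sequence_probability(seq[::-1], q, pi)
            for seq in product(range(3), repeat=3)}

diff_irr = forward_minus_backward(q_irr)
diff_rev = forward_minus_backward(q_rev)
print(diff_irr[(0, 1, 2)], max(abs(v) for v in diff_rev.values()))
```

For the first chain the sequence (0, 1, 2) is more probable than its reversal, whereas for the second chain every sequence is exactly as probable as its reversal.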

We show the relations between the fluxes of the whole system ${J}_{z}$ and of the subsystem ${J}_{x}$ as follows:

$$\begin{aligned} J_x(x'\to x) &= \mathsf{\pi}_x(x')\,q_x(x|x')-\mathsf{\pi}_x(x)\,q_x(x'|x) \\ &= P(\{x',x\})-P(\{x,x'\}) \\ &= \sum_{s,s'}\left(P(\{(x',s'),(x,s)\})-P(\{(x,s),(x',s')\})\right) \\ &= \sum_{s,s'}\left(\mathsf{\pi}_z(x',s')\,q_z(x,s|x',s')-\mathsf{\pi}_z(x,s)\,q_z(x',s'|x,s)\right) \\ &= \sum_{s,s'}J_z((x',s')\to (x,s)). \end{aligned}$$

Similarly, we have:

$$J_s(s'\to s)=\sum_{x,x'}J_z((x',s')\to (x,s)).$$

These relations indicate that the subsystem fluxes ${J}_{x}$ and ${J}_{s}$ can be seen as the coarse-grained levels of total system flux ${J}_{z}$ by averaging over the other parts of the system S and X, respectively. We should emphasize that non-vanishing ${J}_{z}$ does not mean X or S is time-irreversible and vice versa.
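The coarse-graining of the flux can be sketched with array reshaping. In the illustration below, the joint transition matrix is a randomly generated hypothetical example with $d=3$ and $l=2$, and the joint state $z=(x,s)$ is flattened to the index `x * l + s`; these choices are assumptions, and the marginal quantities are computed at the level of pair probabilities, so no Markov property of the marginals is used.

```python
import numpy as np

def stationary_distribution(q):
    """Stationary distribution of a column-stochastic matrix q[z, z_prime]."""
    eigvals, eigvecs = np.linalg.eig(q)
    pi = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1.0))])
    return pi / pi.sum()

d, l = 3, 2                                   # sizes of the X and S state spaces
rng = np.random.default_rng(0)
q_z = rng.random((d * l, d * l)) + 0.1        # hypothetical positive joint chain
q_z /= q_z.sum(axis=0, keepdims=True)         # make every column sum to 1

pi_z = stationary_distribution(q_z)
n_z = q_z * pi_z[np.newaxis, :]               # n_z[z, z'] = pi_z(z') q_z(z|z')
J_z = n_z - n_z.T                             # J_z[z, z'] = J_z(z' -> z)

# Reshape to [x, s, x', s'] and sum out the other subsystem.
J4 = J_z.reshape(d, l, d, l)
J_x = J4.sum(axis=(1, 3))                     # J_x[x, x'] = sum_{s,s'} J_z
J_s = J4.sum(axis=(0, 2))                     # J_s[s, s'] = sum_{x,x'} J_z

# Pair probabilities of the X sequence, P({x', x}), for comparison.
n_x = n_z.reshape(d, l, d, l).sum(axis=(1, 3))
print(np.allclose(J_x, n_x - n_x.T), np.abs(J_s).max())
```

Because S has only two states here, its steady-state flux vanishes identically even though the joint flux does not, which illustrates that a non-vanishing $J_z$ does not by itself make a marginal process time-irreversible.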

## 4. Mutual Information Decomposition to Time-Reversible and Time-Irreversible Parts

According to information theory, the two interacting information systems represented by the bivariate Markov chain Z can be characterized by the Mutual Information Rate (MIR) between the marginal chains X and S in SS. The MIR represents the correlation between the two interacting information systems. It is defined on the probabilities of all possible time sequences, $P(Z^T)$, $P(X^T)$ and $P(S^T)$, and is given by [24],

$$I(X,S)=\lim_{T\to \infty}\frac{1}{T}\sum_{Z^T}P(Z^T)\log \frac{P(Z^T)}{P(X^T)P(S^T)}.$$

It measures the correlation between X and S in unit time or, in other words, the effective bits of information that X and S exchange with each other in unit time. The MIR must be non-negative, and a vanishing $I(X,S)$ indicates that X and S are independent of each other. More explicitly, the corresponding probabilities of these sequences can be evaluated by using Equations (2)–(4); we have:

$$\left\{\begin{array}{l}P(X^T)=\mathsf{\pi}_x(X(1))\prod_{t=1}^{T-1}q_x(X(t+1)|X(t)),\\ P(S^T)=\mathsf{\pi}_s(S(1))\prod_{t=1}^{T-1}q_s(S(t+1)|S(t)),\\ P(Z^T)=\mathsf{\pi}_z(Z(1))\prod_{t=1}^{T-1}q_z(Z(t+1)|Z(t)).\end{array}\right.$$

By substituting these probabilities into Equation (12) (see Appendix A), we have the exact expression of MIR as:

$$\begin{aligned} I(X,S) &= \sum_{z,z'}\mathsf{\pi}_z(z')\,q_z(z|z')\log \frac{q_z(z|z')}{q_x(x|x')\,q_s(s|s')} \\ &= \langle i(z|z')\rangle_{z',z}\ge 0,\quad \mathrm{for}\ z=(x,s)\ \mathrm{and}\ z'=(x',s'), \end{aligned}$$

where $i(z|z')=\log \frac{q_z(z|z')}{q_x(x|x')\,q_s(s|s')}$ is the conditional (Markovian) correlation between the states x and s when the transition $z'=(x',s')\to z=(x,s)$ occurs. This indicates that when the two marginal processes are both Markovian, the MIR is the average of the conditional (Markovian) correlations. These correlations are measurable when transitions occur, and they can be seen as the observables of Z.

By noting the decomposition of transition probabilities in Equation (8), we have a corresponding decomposition of $I(X,S)$ as:

$$I(X,S)=I_D(X,S)+I_B(X,S),\quad \mathrm{with}$$

$$\left\{\begin{array}{l}I_D(X,S)=\sum_{z,z'}\mathsf{\pi}_z(z')\,D_z(z'\to z)\,i(z|z')=\frac{1}{2}\sum_{z,z'}\left(\mathsf{\pi}_z(z')\,q_z(z|z')+\mathsf{\pi}_z(z)\,q_z(z'|z)\right)i(z|z'),\\ I_B(X,S)=\sum_{z,z'}\mathsf{\pi}_z(z')\,B_z(z'\to z)\,i(z|z')=\frac{1}{2}\sum_{z,z'}J_z(z'\to z)\,i(z|z')=\frac{1}{4}\sum_{z,z'}J_z(z'\to z)\left(i(z|z')-i(z'|z)\right).\end{array}\right.$$

This means that the mutual information rate representing the correlations between the two interacting systems can be decomposed into a time-reversible equilibrium part and a time-irreversible nonequilibrium part. This originates from the fact that the underlying information system dynamics is determined by both the time-reversible information landscape and the time-irreversible information flux. These equations are very important for establishing the link to the time-irreversibility. We now give further interpretation for ${I}_{D}(X,S)$ and ${I}_{B}(X,S)$:

Consider a bivariate Markov chain Z in SS wherein X and S are dependent on each other, i.e., $I(X,S)=I_D(X,S)+I_B(X,S)>0$. By the ergodicity of Z, we have the MIR, which measures the averaged conditional correlation along the time sequences $Z^T$,

$$\lim_{T\to \infty}\frac{1}{T}\langle i(Z(t+1)|Z(t))\rangle_{Z^T}=I(X,S),\quad \mathrm{for}\ 1<t<T.$$

Then, $I_B(X,S)$ measures the change of the averaged conditional correlation between X and S when a sequence of Z turns back in time,

$$\lim_{T\to \infty}\frac{1}{T}\langle i(Z(t+1)|Z(t))-i(Z(t)|Z(t+1))\rangle_{Z^T}=2I_B(X,S).$$

A negative $I_B(X,S)$ shows that the correlation between X and S becomes stronger in the time-reversal process of Z; a positive $I_B(X,S)$ shows that the correlation becomes weaker in the time-reversal process of Z. Both cases show that Z is time-irreversible, since we have a non-vanishing $J_z$. However, the case of $I_B(X,S)=0$ is inconclusive, since it is compatible with either a vanishing $J_z$ or a non-vanishing $J_z$. In any case, we see that a non-vanishing $I_B(X,S)$ is a sufficient condition for Z to be time-irreversible. On the other hand, $I_D(X,S)=I(X,S)-I_B(X,S)$ measures the correlation remaining in the backward process of Z.
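The Markovian expressions for $I(X,S)$, $I_D(X,S)$ and $I_B(X,S)$ can be evaluated directly. The sketch below builds a hypothetical bivariate chain (not one of the examples of this paper) in which every column $q_z(\cdot ,\cdot |x',s')$ is a coupling of $q_x(\cdot |x')$ and $q_s(\cdot |s')$; under this assumed construction the row and column sums of each column reproduce $q_x$ and $q_s$, so both marginal chains are Markovian. All matrices, the perturbation `eps` and the index convention `z = x * l + s` are assumptions for illustration.

```python
import numpy as np

def stationary_distribution(q):
    """Stationary distribution of a column-stochastic matrix q[z, z_prime]."""
    eigvals, eigvecs = np.linalg.eig(q)
    pi = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1.0))])
    return pi / pi.sum()

d = l = 3
circ = np.array([[0.5, 0.2, 0.3],
                 [0.3, 0.5, 0.2],
                 [0.2, 0.3, 0.5]])
q_x, q_s = circ, circ.T                       # hypothetical marginal transition matrices

# Product part q_x(x|x') q_s(s|s'), indexed [x, s, x', s'].
prod4 = np.einsum('ab,cd->acbd', q_x, q_s)
# M has zero row and column sums, so adding eps * M keeps both marginals intact.
M = np.eye(d) - np.roll(np.eye(d), 1, axis=1)
eps = 0.03 * np.array([[1, -1, 0], [0, 1, -1], [-1, 0, 1]])   # eps[x', s']
q_z4 = prod4 + M[:, :, None, None] * eps[None, None, :, :]
q_z = q_z4.reshape(d * l, d * l)

pi_z = stationary_distribution(q_z)
n = q_z * pi_z[np.newaxis, :]                 # n[z, z'] = pi_z(z') q_z(z|z')
J = n - n.T                                   # J[z, z'] = J_z(z' -> z)
i = np.log(q_z / prod4.reshape(d * l, d * l)) # i[z, z'] = i(z|z')

I = np.sum(n * i)
I_D = 0.5 * np.sum((n + n.T) * i)
I_B = 0.5 * np.sum(J * i)
I_B_alt = 0.25 * np.sum(J * (i - i.T))
print(I, I_D, I_B)
```

Natural logarithms are used, so the values are in nats; the two expressions for $I_B$ agree because $J_z$ is antisymmetric.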

The definition of MIR in Equation (12) turns out to be appropriate for even more general stationary and ergodic (Markovian or non-Markovian) processes. Consequently, the decomposition of MIR is useful to quantify the correlation between two stationary and ergodic processes in a wider sense, i.e., to monitor the changes of the correlation in the forward and the backward processes. As a special case, the analytical expressions in Equation (14) are the reduced results, which are valid for Markovian cases. A brief discussion of the decomposition of MIR of more general processes can be found in Appendix B.

## 5. Relationship between Mutual Information and Entropy Production

The Entropy Production Rate (EPR), or energy dissipation (cost) rate, at steady state is a quantitative nonequilibrium measure that characterizes the time-irreversibility of the underlying processes. The EPR of a stationary and ergodic process C (here $C=Z$, X or S) is given by the difference between the averaged surprisal (negative logarithmic probability) of the backward sequences ${\tilde{C}}^{T}$ and that of the forward sequences ${C}^{T}$ in the long time limit, i.e.,

$$\begin{array}{cc}\hfill {R}_{c}& =\underset{T\to \infty}{\mathrm{lim}}\frac{1}{T}{\left\langle \mathrm{log}P({C}^{T})-\mathrm{log}P({\tilde{C}}^{T})\right\rangle}_{{C}^{T}}\hfill \\ & =\underset{T\to \infty}{\mathrm{lim}}\frac{1}{T}{\left\langle \mathrm{log}\frac{P({C}^{T})}{P({\tilde{C}}^{T})}\right\rangle}_{{C}^{T}}\ge 0,\hfill \end{array}$$

where ${R}_{c}$ is said to be the EPR of C [25]; $-\mathrm{log}P({C}^{T})$ and $-\mathrm{log}P({\tilde{C}}^{T})$ are the surprisals of a forward and a backward sequence of C, respectively. We see that C is time-reversible (i.e., $P({C}^{T})=P({\tilde{C}}^{T})$ for arbitrary ${C}^{T}$ for large T) if and only if ${R}_{c}=0$. This follows from the form of ${R}_{c}$, which is a Kullback–Leibler divergence (per unit time) between the distributions of the forward and the backward sequences. When C is Markovian, ${R}_{c}$ reduces to the following forms for $C=Z$, X or S, respectively [17,26],

$$\begin{array}{c}\hfill \left\{\begin{array}{c}{R}_{z}=\frac{1}{2}{\sum}_{z,{z}^{\prime}}{J}_{z}({z}^{\prime}\to z)\,\mathrm{log}\frac{{q}_{z}(z|{z}^{\prime})}{{q}_{z}({z}^{\prime}|z)},\hfill \\ {R}_{x}=\frac{1}{2}{\sum}_{x,{x}^{\prime}}{J}_{x}({x}^{\prime}\to x)\,\mathrm{log}\frac{{q}_{x}(x|{x}^{\prime})}{{q}_{x}({x}^{\prime}|x)},\hfill \\ {R}_{s}=\frac{1}{2}{\sum}_{s,{s}^{\prime}}{J}_{s}({s}^{\prime}\to s)\,\mathrm{log}\frac{{q}_{s}(s|{s}^{\prime})}{{q}_{s}({s}^{\prime}|s)},\hfill \end{array}\right.\end{array}$$

where the total and subsystem entropy productions ${R}_{z}$, ${R}_{x}$ and ${R}_{s}$ correspond to Z, X and S, respectively. Here, ${R}_{z}$ usually contains the detailed interaction information of the system (or subsystems) and environments; ${R}_{x}$ and ${R}_{s}$ provide the coarse-grained information of time-irreversible observables of X and S, respectively. Each non-vanishing EPR indicates that the corresponding Markov chain is time-irreversible. Again, we emphasize that a non-vanishing ${R}_{z}$ does not imply that X or S is time-irreversible, and vice versa.
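As an illustration of the Markovian EPR expressions, the following Python sketch evaluates the EPR of a finite chain from its transition matrix. It is only a sketch under stated assumptions: the flux is taken to be $J({z}^{\prime}\to z)=q(z|{z}^{\prime})\pi({z}^{\prime})-q({z}^{\prime}|z)\pi(z)$ (the form that the blind-demon fluxes in Section 6 take), the matrix is stored as `q[z][zp]` for $q(z|{z}^{\prime})$, every allowed transition is also allowed in reverse, and the function names `stationary` and `epr` are hypothetical.

```python
import math

def stationary(q, iters=10_000):
    """Power iteration for pi = Q pi, with q[z][zp] = q(z | zp) (columns sum to 1)."""
    n = len(q)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(q[z][zp] * pi[zp] for zp in range(n)) for z in range(n)]
    return pi

def epr(q):
    """R = 1/2 * sum_{z,z'} J(z' -> z) * log(q(z|z') / q(z'|z))."""
    n = len(q)
    pi = stationary(q)
    total = 0.0
    for z in range(n):
        for zp in range(n):
            # Assumed flux definition: J(z' -> z) = q(z|z') pi(z') - q(z'|z) pi(z).
            flux = q[z][zp] * pi[zp] - q[zp][z] * pi[z]
            if z == zp or flux == 0.0:
                continue  # diagonal and zero-flux pairs contribute nothing
            total += 0.5 * flux * math.log(q[z][zp] / q[zp][z])
    return total

# Illustrative chains (not taken from the paper).
# Biased three-state cycle: step forward with 0.7, backward with 0.2, stay with 0.1.
cycle = [[0.1 if z == zp else 0.7 if z == (zp + 1) % 3 else 0.2
          for zp in range(3)] for z in range(3)]
# Any stationary two-state chain satisfies detailed balance.
two_state = [[0.9, 0.3],
             [0.1, 0.7]]

if __name__ == "__main__":
    print("EPR of biased cycle:", epr(cycle))
    print("EPR of two-state chain:", epr(two_state))
```

For the biased cycle the stationary distribution is uniform and the sum can be done by hand, giving $\frac{1}{2}\mathrm{log}\,3.5$; the two-state chain carries no net flux, so its EPR vanishes up to rounding.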

We are interested in the connection between these EPRs and mutual information. We can associate them with ${I}_{B}(X,S)$ by noting Equations (10), (11) and (14). We have:

$$\begin{array}{cc}\hfill {I}_{B}(X,S)& =\frac{1}{4}\sum _{z,{z}^{\prime}}{J}_{z}\left(z\right|{z}^{\prime})(i\left(z\right|{z}^{\prime})-i\left({z}^{\prime}\right|z))\hfill \\ & =\frac{1}{4}\sum _{z,{z}^{\prime}}{J}_{z}\left(z\right|{z}^{\prime})\mathrm{log}\frac{{q}_{z}\left(z\right|{z}^{\prime})}{{q}_{z}\left({z}^{\prime}\right|z)}-\frac{1}{4}\sum _{x,{x}^{\prime}}{J}_{x}\left(x\right|{x}^{\prime})\mathrm{log}\frac{{q}_{x}\left(x\right|{x}^{\prime})}{{q}_{x}\left({x}^{\prime}\right|x)}-\frac{1}{4}\sum _{s,{s}^{\prime}}{J}_{s}\left(s\right|{s}^{\prime})\mathrm{log}\frac{{q}_{s}\left(s\right|{s}^{\prime})}{{q}_{s}\left({s}^{\prime}\right|s)}\hfill \\ & =\frac{1}{2}({R}_{z}-{R}_{x}-{R}_{s}).\hfill \end{array}$$

We note that ${I}_{B}(X,S)$ is intimately related to the EPRs. This builds a bridge between these EPRs and the irreversible part of the mutual information. Moreover, we also have:

$$\begin{array}{c}\hfill \left\{\begin{array}{c}{R}_{z}={R}_{x}+{R}_{s}+2{I}_{B}(X,S)\ge 0,\hfill \\ {R}_{x}+{R}_{s}\ge -2{I}_{B}(X,S),\hfill \\ {R}_{z}\ge 2{I}_{B}(X,S).\hfill \end{array}\right.\end{array}$$

This indicates that the time-irreversible MIR contributes to the detailed EPRs. In other words, the differences of the entropy production rate of the whole system and subsystems provide the origin of the time-irreversible part of the mutual information. This reveals the nonequilibrium thermodynamic origin of the irreversible mutual information or correlations. Of course, since the EPR is related to the flux directly as is seen from the above definitions, the origin of the EPR or nonequilibrium thermodynamics is from the non-vanishing information flux for the nonequilibrium dynamics. On the other hand, the irreversible part of the mutual information measures the correlations, and it contributes to the EPRs of the correlated subsystems.

Furthermore, the last expression in Equation (17) (also the expressions in Equation (18)) can be generalized to more general stationary and ergodic processes. A related discussion and demonstration of this can be seen in Appendix B.

## 6. A Simple Case: The Blind Demon

As a concrete example, we consider a two-state system coupled to two information baths a and b. The states of the system are denoted by $\mathcal{X}=\{x:x=0,1\}$. Each bath sends an instruction to the system. If the system adopts one of them, it follows the instruction and changes its state accordingly. The instructions generated by each bath are independent and identically distributed (Bernoulli trials). The probability distributions of the instructions read $\{{\epsilon}_{a}(x):x\in \mathcal{X},{\epsilon}_{a}(x)\ge 0,{\sum}_{x}{\epsilon}_{a}(x)=1\}$ for bath a and $\{{\epsilon}_{b}(x):x\in \mathcal{X},{\epsilon}_{b}(x)\ge 0,{\sum}_{x}{\epsilon}_{b}(x)=1\}$ for bath b. Since the system cannot execute two instructions simultaneously, there exists an information demon that makes choices for the system. The demon is blind: it does not observe the system, and its choices are independent and identically distributed. The choices of the demon are denoted by $\mathcal{S}=\{s:s=a,b\}$. The probability distribution of the demon’s choices reads $\{P(s):s\in \mathcal{S},P(a)=p,P(b)=1-p,p\in [0,1]\}$. As before, we use $Z=(X,S)$ with $X\in \mathcal{X}$ and $S\in \mathcal{S}$ to denote the BMC of the system and the demon.

Consequently, the transition probabilities of the system read:

$$\begin{array}{c}\hfill {q}_{x}(x|{x}^{\prime})=p{\epsilon}_{a}(x)+(1-p){\epsilon}_{b}(x).\end{array}$$

The transition probabilities of the demon read:

$$\begin{array}{c}\hfill {q}_{s}(s|{s}^{\prime})=P(s).\end{array}$$

Additionally, the transition probabilities of the joint chain read:

$$\begin{array}{c}\hfill {q}_{z}(x,s|{x}^{\prime},{s}^{\prime})=P(s){\epsilon}_{{s}^{\prime}}(x).\end{array}$$

We have the corresponding steady state distributions, or information landscapes, as,

$$\begin{array}{c}\hfill \left\{\begin{array}{c}{\pi}_{x}(x)=p{\epsilon}_{a}(x)+(1-p){\epsilon}_{b}(x),\hfill \\ {\pi}_{s}(s)=P(s),\hfill \\ {\pi}_{z}(x,s)=P(s){\pi}_{x}(x).\hfill \end{array}\right.\end{array}$$

We obtain the information fluxes as,

$$\begin{array}{c}\hfill \left\{\begin{array}{c}{J}_{x}({x}^{\prime}\to x)=0,\ \mathrm{for}\ \mathrm{all}\ x,{x}^{\prime}\in \mathcal{X},\hfill \\ {J}_{s}({s}^{\prime}\to s)=0,\ \mathrm{for}\ \mathrm{all}\ s,{s}^{\prime}\in \mathcal{S},\hfill \\ {J}_{z}(({x}^{\prime},{s}^{\prime})\to (x,s))=P(s)P({s}^{\prime})\left({\pi}_{x}({x}^{\prime}){\epsilon}_{{s}^{\prime}}(x)-{\pi}_{x}(x){\epsilon}_{s}({x}^{\prime})\right).\hfill \end{array}\right.\end{array}$$

Here, we use the notations ${\epsilon}_{s}({x}^{\prime})$ and ${\epsilon}_{{s}^{\prime}}(x)$ ($s,{s}^{\prime}=a\ \mathrm{or}\ b$) to denote the probabilities of the instructions ${x}^{\prime}$ or x from bath a or b. We obtain the EPRs as:

$$\begin{array}{c}\hfill \left\{\begin{array}{c}{R}_{x}=0,\hfill \\ {R}_{s}=0,\hfill \\ {R}_{z}={\sum}_{x}p(1-p)\left({\epsilon}_{a}(x)-{\epsilon}_{b}(x)\right)\left(\mathrm{log}{\epsilon}_{a}(x)-\mathrm{log}{\epsilon}_{b}(x)\right).\hfill \end{array}\right.\end{array}$$

We evaluate the MIR as:

$$\begin{array}{c}\hfill I(X,S)=-\sum _{x}{\pi}_{x}(x)\mathrm{log}{\pi}_{x}(x)+p\sum _{x}{\epsilon}_{a}(x)\mathrm{log}{\epsilon}_{a}(x)+(1-p)\sum _{x}{\epsilon}_{b}(x)\mathrm{log}{\epsilon}_{b}(x).\end{array}$$

The time-irreversible part of $I(X,S)$ reads,

$$\begin{array}{c}\hfill {I}_{B}(X,S)=\frac{1}{2}{R}_{z}.\end{array}$$
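The closed-form results for the blind demon can be cross-checked numerically. The sketch below builds the joint chain from $p$, ${\epsilon}_{a}$ and ${\epsilon}_{b}$, evaluates ${R}_{z}$ from the fluxes and the MIR from the transition sum of Appendix A, and returns them next to the closed-form expressions. The parameter values, the function name `blind_demon_rates` and the requirement that $0<p<1$ with strictly positive instruction probabilities (so that all logarithms are finite) are assumptions made for this illustration.

```python
import math

def blind_demon_rates(p, eps_a, eps_b):
    """Return (R_z from fluxes, R_z closed form, MIR from transition sum, MIR closed form)."""
    P = {"a": p, "b": 1.0 - p}            # demon's choice probabilities
    eps = {"a": eps_a, "b": eps_b}        # instruction distributions of the baths
    xs = range(len(eps_a))
    pi_x = [p * eps_a[x] + (1.0 - p) * eps_b[x] for x in xs]
    states = [(x, s) for x in xs for s in "ab"]

    def q_z(z, zp):                       # q_z(x,s|x',s') = P(s) eps_{s'}(x)
        return P[z[1]] * eps[zp[1]][z[0]]

    def pi_z(z):                          # pi_z(x,s) = P(s) pi_x(x)
        return P[z[1]] * pi_x[z[0]]

    r_z = mir = 0.0
    for z in states:
        for zp in states:
            flux = q_z(z, zp) * pi_z(zp) - q_z(zp, z) * pi_z(z)
            r_z += 0.5 * flux * math.log(q_z(z, zp) / q_z(zp, z))
            q_u = pi_x[z[0]] * P[z[1]]    # q_x(x|x') q_s(s|s') for this model
            mir += pi_z(zp) * q_z(z, zp) * math.log(q_z(z, zp) / q_u)

    r_z_closed = sum(p * (1.0 - p) * (a - b) * (math.log(a) - math.log(b))
                     for a, b in zip(eps_a, eps_b))
    mir_closed = (-sum(v * math.log(v) for v in pi_x)
                  + p * sum(a * math.log(a) for a in eps_a)
                  + (1.0 - p) * sum(b * math.log(b) for b in eps_b))
    return r_z, r_z_closed, mir, mir_closed

if __name__ == "__main__":
    # Hypothetical parameters chosen only for illustration.
    r_z, r_z_closed, mir, mir_closed = blind_demon_rates(0.3, [0.8, 0.2], [0.4, 0.6])
    print("R_z:", r_z, "closed form:", r_z_closed)
    print("I(X,S):", mir, "closed form:", mir_closed)
    print("I_B = R_z / 2:", 0.5 * r_z)
```

Because ${R}_{x}={R}_{s}=0$ here, the time-irreversible part is simply half of the returned ${R}_{z}$.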

## 7. Conclusions

In this work, we identify the driving forces for the information system dynamics. We show that for marginal Markovian information systems, the information dynamics is determined by both the information landscape and information flux. While the information landscape can be used to construct the driving force for describing the time-reversible behavior of the information dynamics, the information flux can be used to describe the time-irreversible behavior of the information dynamics. The information flux explicitly breaks the detailed balance and provides a quantitative measure of the degree of the nonequilibrium or time-irreversibility. We further demonstrate that the mutual information rate, which represents the correlations, can be decomposed into the time-reversible part and the time-irreversible part originated from the landscape and flux decomposition of the information dynamics. Finally, we uncover the intimate relationship between the difference of the entropy productions of the whole system and those of the subsystems and the time-irreversible part of the mutual information. This will help with understanding the non-equilibrium behavior of the interacting information system dynamics in stochastic environments. Furthermore, we verify that our conclusions on the mutual information rate and entropy production rate decomposition can be made more general for the stationary and ergodic processes.

## Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (NSFC-91430217) and the National Science Foundation (U.S.) (NSF-PHY-76066).

## Author Contributions

Qian Zeng and Jin Wang conceived and designed the experiments; Qian Zeng performed the experiments; Qian Zeng and Jin Wang analyzed the data; Qian Zeng and Jin Wang contributed reagents/materials/analysis tools; Qian Zeng and Jin Wang wrote the paper.

## Conflicts of Interest

The authors declare no conflict of interest.

## Abbreviations

The following abbreviations are used in this manuscript:

| Abbreviation | Meaning |
|---|---|
| BMC | Bivariate Markov Chain |
| EPR | Entropy Production Rate |
| MIR | Mutual Information Rate |
| SS | Steady State |

## Appendix A

Here, we derive the exact form of the Mutual Information Rate (MIR, Equation (13)) in the steady state by using the cumulant-generating function.

We write an arbitrary time sequence of Z in time T in the following form:

$$\begin{array}{c}\hfill {Z}^{T}=\{Z(1),\dots ,Z(i),\dots ,Z(T)\},\ \mathrm{for}\ T\ge 2,\end{array}$$

where $Z(i)$ (for $i\ge 1$) denotes the state at time i; for brevity, we also write ${Z}_{i}$ for $Z(i)$. The corresponding probability of ${Z}^{T}$ is in the following form:

$$\begin{array}{c}\hfill P({Z}^{T})={\pi}_{z}({Z}_{1})\left\{\prod _{i=1}^{T-1}{q}_{z}({Z}_{i+1}|{Z}_{i})\right\}.\end{array}$$

We let the chain $U=(X,S)$ denote a process in which X and S follow the same Markov dynamics as in Z, but are independent of each other. Then, the transition probabilities of U read:

$$\begin{array}{c}\hfill {q}_{u}(u|{u}^{\prime})=q(x,s|{x}^{\prime},{s}^{\prime})={q}_{x}(x|{x}^{\prime}){q}_{s}(s|{s}^{\prime}).\end{array}$$

Then, the probability of a time sequence of U, ${U}^{T}$, with the same trajectory as ${Z}^{T}$ reads:

$$\begin{array}{c}\hfill P({U}^{T})={\pi}_{u}({Z}_{1})\left\{\prod _{i=1}^{T-1}{q}_{u}({Z}_{i+1}|{Z}_{i})\right\},\end{array}$$

with ${\pi}_{u}(x,s)={\pi}_{x}(x){\pi}_{s}(s)$ being the stationary probability of U.

To evaluate the exact form of the MIR, we introduce the cumulant-generating function of the random variable $\mathrm{log}\frac{P({Z}^{T})}{P({U}^{T})}$,

$$\begin{array}{c}\hfill K(m,T)=\mathrm{log}{\left\langle \mathrm{exp}\left(m\,\mathrm{log}\frac{P({Z}^{T})}{P({U}^{T})}\right)\right\rangle}_{{Z}^{T}}.\end{array}$$

We can see that:

$$\begin{array}{c}\underset{T\to \infty}{\mathrm{lim}}\underset{m\to 0}{\mathrm{lim}}\frac{1}{T}\frac{\partial K(m,T)}{\partial m}\hfill \\ =\underset{T\to \infty}{\mathrm{lim}}\frac{1}{T}{\left\langle \mathrm{log}\frac{P({Z}^{T})}{P({U}^{T})}\right\rangle}_{{Z}^{T}}\hfill \\ =I(X,S).\hfill \end{array}$$

Thus, our idea is to evaluate $K(m,T)$ first. We have:

$$\begin{array}{cc}\hfill K(m,T)& =\mathrm{log}{\left\langle \mathrm{exp}\left(m\,\mathrm{log}\frac{P({Z}^{T})}{P({U}^{T})}\right)\right\rangle}_{{Z}^{T}}\hfill \\ & =\mathrm{log}\left\{\sum _{{Z}^{T}}\frac{{\left(P({Z}^{T})\right)}^{m+1}}{{\left(P({U}^{T})\right)}^{m}}\right\}\hfill \\ & =\mathrm{log}\left\{\sum _{\{Z(1),Z(2),\dots ,Z(T)\}}\frac{{\pi}_{z}^{m+1}({Z}_{1})}{{\pi}_{u}^{m}({Z}_{1})}\prod _{i=1}^{T-1}\frac{{q}_{z}^{m+1}({Z}_{i+1}|{Z}_{i})}{{q}_{u}^{m}({Z}_{i+1}|{Z}_{i})}\right\},\hfill \end{array}$$

where the last equality can be rewritten in the form of matrix multiplication.

We introduce the following matrices and vectors for Equation (A6) such that:

$$\begin{array}{c}{\mathit{Q}}_{z}=\left\{{({\mathit{Q}}_{z})}_{(z,{z}^{\prime})}={q}_{z}(z|{z}^{\prime}),\ \mathrm{for}\ z,{z}^{\prime}\in \mathcal{Z}\right\},\hfill \\ \mathit{G}(m)=\left\{{(\mathit{G}(m))}_{(z,{z}^{\prime})}=\frac{{q}_{z}^{m+1}(z|{z}^{\prime})}{{q}_{u}^{m}(z|{z}^{\prime})},\ \mathrm{for}\ z,{z}^{\prime}\in \mathcal{Z}\right\},\hfill \\ {\pi}_{z}=\left\{{({\pi}_{z})}_{z}={\pi}_{z}(z),\ \mathrm{for}\ z\in \mathcal{Z}\right\},\hfill \\ \mathit{v}(m)=\left\{{(\mathit{v}(m))}_{z}=\frac{{\pi}_{z}^{m+1}(z)}{{\pi}_{u}^{m}(z)}\right\},\hfill \end{array}$$

where ${\mathit{Q}}_{z}$ is the transition matrix of Z; ${\pi}_{z}$ is the stationary distribution of Z. It can also be verified that:

$$\begin{array}{c}{\mathit{Q}}_{z}=\mathit{G}(0),\hfill \\ {\pi}_{z}=\mathit{v}(0),\hfill \\ {\pi}_{z}={\mathit{Q}}_{z}{\pi}_{z},\hfill \\ {\mathbf{1}}^{\dagger}{\mathit{Q}}_{z}={\mathbf{1}}^{\dagger},\hfill \\ \underset{m\to 0}{\mathrm{lim}}\frac{d\mathit{G}(m)}{dm}=\left\{{\left(\underset{m\to 0}{\mathrm{lim}}\frac{d\mathit{G}(m)}{dm}\right)}_{(z,{z}^{\prime})}={q}_{z}(z|{z}^{\prime})\mathrm{log}\frac{{q}_{z}(z|{z}^{\prime})}{{q}_{u}(z|{z}^{\prime})},\ \mathrm{for}\ z,{z}^{\prime}\in \mathcal{Z}\right\},\hfill \\ \underset{m\to 0}{\mathrm{lim}}\frac{d\mathit{v}(m)}{dm}=\left\{{\left(\underset{m\to 0}{\mathrm{lim}}\frac{d\mathit{v}(m)}{dm}\right)}_{z}={\pi}_{z}(z)\mathrm{log}\frac{{\pi}_{z}(z)}{{\pi}_{u}(z)},\ \mathrm{for}\ z\in \mathcal{Z}\right\},\hfill \end{array}$$

where ${\mathbf{1}}^{\dagger}$ is the row vector of all ones with appropriate dimension.

Then, $K(m,T)$ can be rewritten in a compact form such that:

$$\begin{array}{c}\hfill K(m,T)=\mathrm{log}\left\{{\mathbf{1}}^{\dagger}{\mathit{G}}^{T-1}(m)\mathit{v}(m)\right\}.\end{array}$$

Then, we substitute Equation (A9) into Equation (A5) and have:

$$\begin{array}{cc}\hfill I(X,S)& =\underset{T\to \infty}{\mathrm{lim}}\underset{m\to 0}{\mathrm{lim}}\frac{1}{T}\frac{\partial K(m,T)}{\partial m}\hfill \\ & =\underset{T\to \infty}{\mathrm{lim}}\underset{m\to 0}{\mathrm{lim}}\frac{1}{T}\frac{\partial \mathrm{log}\left\{{\mathbf{1}}^{\dagger}{\mathit{G}}^{T-1}(m)\mathit{v}(m)\right\}}{\partial m}\hfill \\ & =\underset{T\to \infty}{\mathrm{lim}}\underset{m\to 0}{\mathrm{lim}}\frac{1}{T}\left\{(T-1){\mathbf{1}}^{\dagger}{\mathit{G}}^{T-2}(m)\frac{d\mathit{G}(m)}{dm}\mathit{v}(m)+{\mathbf{1}}^{\dagger}{\mathit{G}}^{T-1}(m)\frac{d\mathit{v}(m)}{dm}\right\}\hfill \\ & =\underset{T\to \infty}{\mathrm{lim}}\frac{1}{T}\left\{(T-1){\mathbf{1}}^{\dagger}{\mathit{G}}^{T-2}(0)\left(\underset{m\to 0}{\mathrm{lim}}\frac{d\mathit{G}(m)}{dm}\right)\mathit{v}(0)+{\mathbf{1}}^{\dagger}{\mathit{G}}^{T-1}(0)\left(\underset{m\to 0}{\mathrm{lim}}\frac{d\mathit{v}(m)}{dm}\right)\right\}.\hfill \end{array}$$

By noting Equation (A8) and $T\ge 2$, we obtain Equation (13) from Equation (A10) such that:

$$\begin{array}{cc}\hfill I(X,S)& =\underset{T\to \infty}{\mathrm{lim}}\frac{1}{T}\left\{(T-1){\mathbf{1}}^{\dagger}{\mathit{G}}^{T-2}(0)\left(\underset{m\to 0}{\mathrm{lim}}\frac{d\mathit{G}(m)}{dm}\right)\mathit{v}(0)+{\mathbf{1}}^{\dagger}{\mathit{G}}^{T-1}(0)\left(\underset{m\to 0}{\mathrm{lim}}\frac{d\mathit{v}(m)}{dm}\right)\right\}\hfill \\ & =\underset{T\to \infty}{\mathrm{lim}}\left\{\left(1-\frac{1}{T}\right){\mathbf{1}}^{\dagger}\left(\underset{m\to 0}{\mathrm{lim}}\frac{d\mathit{G}(m)}{dm}\right){\pi}_{z}+\frac{1}{T}{\mathbf{1}}^{\dagger}\left(\underset{m\to 0}{\mathrm{lim}}\frac{d\mathit{v}(m)}{dm}\right)\right\}\hfill \\ & ={\mathbf{1}}^{\dagger}\left(\underset{m\to 0}{\mathrm{lim}}\frac{d\mathit{G}(m)}{dm}\right){\pi}_{z}\hfill \\ & =\sum _{(x,s),({x}^{\prime},{s}^{\prime})}{\pi}_{z}({x}^{\prime},{s}^{\prime}){q}_{z}(x,s|{x}^{\prime},{s}^{\prime})\mathrm{log}\frac{{q}_{z}(x,s|{x}^{\prime},{s}^{\prime})}{{q}_{x}(x|{x}^{\prime}){q}_{s}(s|{s}^{\prime})}.\hfill \end{array}$$
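The cumulant-generating-function route can be checked numerically on a small model. The sketch below uses the blind-demon chain of Section 6 with hypothetical parameter values, evaluates $K(m,T)=\mathrm{log}\{{\mathbf{1}}^{\dagger}{\mathit{G}}^{T-1}(m)\mathit{v}(m)\}$ by repeated matrix-vector products, and differentiates it at $m=0$ with a central finite difference (a numerical stand-in for the analytical derivative; the step `h` and all function names are assumptions of this illustration).

```python
import math

# Hypothetical blind-demon parameters, chosen only for illustration.
P = {"a": 0.3, "b": 0.7}                     # demon's choice probabilities
EPS = {"a": [0.8, 0.2], "b": [0.4, 0.6]}     # instruction distributions of the baths
STATES = [(x, s) for x in (0, 1) for s in "ab"]
PI_X = [P["a"] * EPS["a"][x] + P["b"] * EPS["b"][x] for x in (0, 1)]

def q_z(z, zp):
    """Joint chain: q_z(x,s|x',s') = P(s) eps_{s'}(x)."""
    return P[z[1]] * EPS[zp[1]][z[0]]

def q_u(z, zp):
    """Independent chain: q_x(x|x') q_s(s|s'), which is pi_x(x) P(s) for this model."""
    return PI_X[z[0]] * P[z[1]]

def pi_z(z):
    return P[z[1]] * PI_X[z[0]]

def pi_u(z):
    return PI_X[z[0]] * P[z[1]]

def K(m, T):
    """K(m,T) = log(1^T G(m)^(T-1) v(m)), with G and v defined as in the appendix."""
    vec = [pi_z(z) ** (m + 1) / pi_u(z) ** m for z in STATES]
    for _ in range(T - 1):
        vec = [sum(q_z(z, zp) ** (m + 1) / q_u(z, zp) ** m * vec[j]
                   for j, zp in enumerate(STATES)) for z in STATES]
    return math.log(sum(vec))

def mir_from_cgf(T, h=1e-4):
    """(1/T) dK/dm at m = 0, approximated by a central finite difference."""
    return (K(h, T) - K(-h, T)) / (2.0 * h * T)

def mir_closed_form():
    """Transition-sum expression for I(X,S) obtained at the end of the derivation."""
    return sum(pi_z(zp) * q_z(z, zp) * math.log(q_z(z, zp) / q_u(z, zp))
               for z in STATES for zp in STATES)

if __name__ == "__main__":
    for T in (10, 100, 1000):
        print(T, mir_from_cgf(T), mir_closed_form())
```

At finite $T$ the derivative equals $(1-1/T)$ times the limiting value plus a boundary term of order $1/T$ (which happens to vanish for this model because ${\pi}_{z}={\pi}_{u}$), so the printed estimates approach the transition-sum value as $T$ grows.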

## Appendix B

#### Appendix B.1 Discussions on the Generality of Mutual Information Rate Decomposition and Connections to Entropy Production in Terms of Equations (14), (17), and (18)

In general, we do not expect both X and S to be Markovian. Even the joint chain Z may be non-Markovian. This means that Equation (2) may fail to describe the dynamics of Z, and the landscape-flux decomposition then needs to be generalized. Such a decomposition has not yet been developed for non-Markovian cases and will be discussed in a separate work. However, when Z is a stationary and ergodic process (assuming also that both X and S are stationary and ergodic), we show that the MIR can still be decomposed into two parts, as in Equation (14), and that a relation between the MIR and the EPRs can still be found in the same form as the last expression in Equation (17).

We are interested in the correlation between the forward sequences of X and S, which can be measured by $\mathrm{log}\frac{P({Z}^{T})}{P({X}^{T})P({S}^{T})}$ (with ${Z}^{T}=({X}^{T},{S}^{T})$); the MIR then quantifies the average rate of this correlation in the long time limit, as shown in Equation (12). Furthermore, we are interested in the averaged difference between the rate of the correlation of the forward processes and that of the backward processes. This leads to the time-irreversible part of the MIR defined by:

$$\begin{array}{c}\hfill {I}_{B}(X,S)=\underset{T\to \infty}{\mathrm{lim}}\frac{1}{2T}{\left\langle \mathrm{log}\frac{P({Z}^{T})}{P({X}^{T})P({S}^{T})}-\mathrm{log}\frac{P({\tilde{Z}}^{T})}{P({\tilde{X}}^{T})P({\tilde{S}}^{T})}\right\rangle}_{{Z}^{T}},\end{array}$$

where $\mathrm{log}\frac{P({\tilde{Z}}^{T})}{P({\tilde{X}}^{T})P({\tilde{S}}^{T})}$ quantifies the correlation between the backward sequences of X and S. Clearly, the sign of the time-irreversible part of the MIR indicates whether the correlation of the forward processes of X and S is enhanced (${I}_{B}(X,S)>0$) or weakened (${I}_{B}(X,S)<0$) compared to that of the backward processes. The other part of the MIR, namely the time-reversible part, gives the averaged rate of the correlation that remains in both the forward and backward processes,

$$\begin{array}{c}\hfill {I}_{D}(X,S)=\underset{T\to \infty}{\mathrm{lim}}\frac{1}{2T}{\left\langle \mathrm{log}\frac{P({Z}^{T})}{P({X}^{T})P({S}^{T})}+\mathrm{log}\frac{P({\tilde{Z}}^{T})}{P({\tilde{X}}^{T})P({\tilde{S}}^{T})}\right\rangle}_{{Z}^{T}}.\end{array}$$

Consequently, the MIR $I(X,S)$ is decomposed into two parts as $I(X,S)={I}_{D}(X,S)+{I}_{B}(X,S)$. In Markovian cases, each part of the MIR reduces to the corresponding form in Equation (14).

The relation between the time-irreversible part of the MIR and EPRs can be shown as follows,

$$\begin{array}{cc}\hfill {I}_{B}(X,S)& =\underset{T\to \infty}{\mathrm{lim}}\frac{1}{2T}{\left\langle \mathrm{log}\frac{P({Z}^{T})}{P({X}^{T})P({S}^{T})}-\mathrm{log}\frac{P({\tilde{Z}}^{T})}{P({\tilde{X}}^{T})P({\tilde{S}}^{T})}\right\rangle}_{{Z}^{T}}\hfill \\ & =\underset{T\to \infty}{\mathrm{lim}}\frac{1}{2T}\left\{{\left\langle \mathrm{log}\frac{P({Z}^{T})}{P({\tilde{Z}}^{T})}\right\rangle}_{{Z}^{T}}-{\left\langle \mathrm{log}\frac{P({X}^{T})}{P({\tilde{X}}^{T})}\right\rangle}_{{X}^{T}}-{\left\langle \mathrm{log}\frac{P({S}^{T})}{P({\tilde{S}}^{T})}\right\rangle}_{{S}^{T}}\right\}\hfill \\ & =\frac{1}{2}\left({R}_{z}-{R}_{x}-{R}_{s}\right),\hfill \end{array}$$

which is in the same form as Equation (17). Additionally, due to the non-negativity of the EPRs, the inequalities in (18) still hold for general cases.
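The passage from the first to the second line of this derivation can be spelled out as a short worked step. It assumes only that $P({X}^{T})$ and $P({S}^{T})$ are the marginals of $P({Z}^{T})$; $f$ denotes an arbitrary function introduced here for illustration. Regrouping the logarithms trajectory by trajectory gives

$$\mathrm{log}\frac{P({Z}^{T})}{P({X}^{T})P({S}^{T})}-\mathrm{log}\frac{P({\tilde{Z}}^{T})}{P({\tilde{X}}^{T})P({\tilde{S}}^{T})}=\mathrm{log}\frac{P({Z}^{T})}{P({\tilde{Z}}^{T})}-\mathrm{log}\frac{P({X}^{T})}{P({\tilde{X}}^{T})}-\mathrm{log}\frac{P({S}^{T})}{P({\tilde{S}}^{T})},$$

and an average over ${Z}^{T}$ of a quantity that depends on ${X}^{T}$ alone reduces to an average over ${X}^{T}$,

$${\left\langle f({X}^{T})\right\rangle}_{{Z}^{T}}=\sum _{{X}^{T}}\sum _{{S}^{T}}P({X}^{T},{S}^{T})f({X}^{T})=\sum _{{X}^{T}}P({X}^{T})f({X}^{T})={\left\langle f({X}^{T})\right\rangle}_{{X}^{T}},$$

and likewise for ${S}^{T}$. Dividing by $2T$ and taking $T\to \infty$ then identifies the three averages with ${R}_{z}$, ${R}_{x}$ and ${R}_{s}$.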

#### Appendix B.2. The Smart Demon

To verify the conclusions in more general cases, we construct a model of a smart demon, which reflects a more general situation in nature: the two information subsystems feed back on each other. A three-state information system is connected to two information baths labeled a and b. The states of the system are denoted by $\mathcal{X}=\{x:x=0,1,2\}$. Each bath sends an instruction to the system. If the system adopts one of them, it follows the instruction and changes its state accordingly. The instructions generated by either bath are independent and identically distributed. The probability distributions of the instructions corresponding to the baths read $\{{\epsilon}_{s}(x):{\epsilon}_{s}(x)\ge 0,{\sum}_{x\in \mathcal{X}}{\epsilon}_{s}(x)=1\}$ (for $s=a,b$). Since the system cannot execute the two incoming instructions simultaneously, there exists an information demon making choices for the system. The choices of the demon are denoted by the labels of the baths, $\mathcal{S}=\{s:s=a,b\}$. The demon observes the state of the system and provides feedback. The (conditional) probability distribution of the demon’s choices reads $\{d(s|{x}^{\prime},{s}^{\prime}):d(s|{x}^{\prime},{s}^{\prime})\ge 0,{\sum}_{s\in \mathcal{S}}d(s|{x}^{\prime},{s}^{\prime})=1,{x}^{\prime}\in \mathcal{X},{s}^{\prime}\in \mathcal{S}\}$. As before, we use X, S and $Z=(X,S)$ to denote the processes of the system, the demon and the corresponding joint chain, a BMC, respectively.

The transition probabilities of the BMC read:

$$\begin{array}{c}\hfill {q}_{z}(z|{z}^{\prime})={q}_{z}(x,s|{x}^{\prime},{s}^{\prime})=d(s|{x}^{\prime},{s}^{\prime}){\epsilon}_{s}(x),\end{array}$$

where ${\epsilon}_{s}(x)$ denotes the probability of the instruction x from bath $s=a,b$. We assume that there is a unique stationary distribution of Z, ${\pi}_{z}$, such that:

$$\begin{array}{c}\hfill {\pi}_{z}(z)=\sum _{{z}^{\prime}}{q}_{z}(z|{z}^{\prime}){\pi}_{z}({z}^{\prime}).\end{array}$$

The stationary distributions of S and X then read:

$$\begin{array}{c}\hfill \left\{\begin{array}{c}{\pi}_{s}(s)={\sum}_{x}{\pi}_{z}(x,s),\hfill \\ {\pi}_{x}(x)={\sum}_{s}{\pi}_{z}(x,s).\hfill \end{array}\right.\end{array}$$

The behavior of the demon can be seen as a Markovian process in the steady state. The corresponding transition probabilities of the demon read:

$$\begin{array}{c}\hfill {q}_{s}(s|{s}^{\prime})=\frac{1}{{\pi}_{s}({s}^{\prime})}\sum _{{x}^{\prime}}d(s|{x}^{\prime},{s}^{\prime}){\pi}_{z}({x}^{\prime},{s}^{\prime}).\end{array}$$

It can be verified that ${\pi}_{s}$ is the unique stationary distribution of S. However, the dynamics of the system X is non-Markovian in general.

To characterize the time-irreversibility of Z, X and S, we use the definition of EPR in Equation (15) and have:

$$\begin{array}{c}\hfill \left\{\begin{array}{c}{R}_{z}=\frac{1}{2}{\sum}_{z,{z}^{\prime}}{J}_{z}({z}^{\prime}\to z)\,\mathrm{log}\frac{{q}_{z}(z|{z}^{\prime})}{{q}_{z}({z}^{\prime}|z)},\hfill \\ {R}_{s}=\frac{1}{2}{\sum}_{s,{s}^{\prime}}{J}_{s}({s}^{\prime}\to s)\,\mathrm{log}\frac{{q}_{s}(s|{s}^{\prime})}{{q}_{s}({s}^{\prime}|s)}=0,\hfill \\ {R}_{x}={\mathrm{lim}}_{T\to \infty}\frac{1}{T}{\sum}_{{X}^{T}}P({X}^{T})\,\mathrm{log}\frac{P({X}^{T})}{P({\tilde{X}}^{T})},\hfill \end{array}\right.\end{array}$$

where:

$$\begin{array}{c}\hfill P({X}^{T})=\sum _{{S}^{T}}P({Z}^{T}=({X}^{T},{S}^{T})).\end{array}$$

To quantify the correlation between the system and demon, we use the definition of MIR in Equation (12).

We are also interested in the time-irreversible part of MIR, ${I}_{B}(X,S)$, which influences the EPR of the system, ${R}_{x}$. This can be seen from Equation (A14) such that:

$$\begin{array}{c}\hfill {R}_{x}={R}_{z}-{R}_{s}-2{I}_{B}(X,S).\end{array}$$

We use numerical simulations, which evaluate ${R}_{x}$, $I(X,S)$ and ${I}_{B}(X,S)$ directly from the typical sequences of Z (see [7,8]). The corresponding results can be given by:

$$\begin{array}{c}\hfill \left\{\begin{array}{c}{R}_{x}\approx \frac{1}{T}\mathrm{log}\frac{P({X}^{T})}{P({\tilde{X}}^{T})},\ \mathrm{for}\ \mathrm{large}\ T,\hfill \\ I(X,S)\approx \frac{1}{T}\mathrm{log}\frac{P({Z}^{T})}{P({X}^{T})P({S}^{T})},\ \mathrm{for}\ \mathrm{large}\ T,\hfill \\ {I}_{B}(X,S)\approx \frac{1}{2T}\mathrm{log}\frac{P({Z}^{T})}{P({X}^{T})P({S}^{T})}-\frac{1}{2T}\mathrm{log}\frac{P({\tilde{Z}}^{T})}{P({\tilde{X}}^{T})P({\tilde{S}}^{T})},\ \mathrm{for}\ \mathrm{large}\ T,\hfill \end{array}\right.\end{array}$$

where ${Z}^{T}=({X}^{T},{S}^{T})$ is a typical sequence of Z (hence, ${X}^{T}$ and ${S}^{T}$ are typical sequences of X and S, respectively). The convergence of this numerical simulation can be observed as T increases. To confirm the result ${R}_{x}={R}_{z}-{R}_{s}-2{I}_{B}(X,S)$, we use different typical sequences in calculating ${R}_{x}$ and ${I}_{B}(X,S)$, respectively. ${R}_{z}$ and ${R}_{s}$ are calculated by using the corresponding analytical results shown above.
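One possible way to carry out the sequence-based estimate of ${R}_{x}$ is sketched below. Since X is not Markovian, $P({X}^{T})$ is obtained by summing over the demon's sequences with a rescaled forward recursion (a standard hidden-Markov technique, swapped in here as an assumption; the source does not state how the authors computed it). The parameters are the Group 1 values listed in the tables that follow, the random seed and all function names are hypothetical, and no claim is made that a single run reproduces the reported figures.

```python
import math
import random

# Group 1 parameters; D_A[(x', s')] stores d(s=a | x', s'), and d(s=b | .) = 1 - d(s=a | .).
EPS = {"a": [0.2344, 0.2730, 0.4926], "b": [0.4217, 0.4094, 0.1689]}
D_A = {(0, "a"): 0.3844, (0, "b"): 0.6811, (1, "a"): 0.5195,
       (1, "b"): 0.8088, (2, "a"): 0.3775, (2, "b"): 0.3340}
STATES = [(x, s) for x in range(3) for s in "ab"]

def d(s, xp, sp):
    return D_A[(xp, sp)] if s == "a" else 1.0 - D_A[(xp, sp)]

def q_z(z, zp):
    """q_z(x,s|x',s') = d(s|x',s') eps_s(x)."""
    return d(z[1], zp[0], zp[1]) * EPS[z[1]][z[0]]

def stationary(iters=2000):
    """Power iteration for the stationary distribution of the joint chain."""
    pi = {z: 1.0 / len(STATES) for z in STATES}
    for _ in range(iters):
        pi = {z: sum(q_z(z, zp) * pi[zp] for zp in STATES) for z in STATES}
    return pi

def log_p_x(xs, pi):
    """log P(X^T) = log sum over S^T of P(X^T, S^T), via a rescaled forward recursion."""
    alpha = {s: pi[(xs[0], s)] for s in "ab"}
    logp = 0.0
    for t in range(len(xs)):
        if t > 0:
            alpha = {s: EPS[s][xs[t]] * sum(d(s, xs[t - 1], sp) * alpha[sp] for sp in "ab")
                     for s in "ab"}
        norm = alpha["a"] + alpha["b"]
        logp += math.log(norm)
        alpha = {s: a / norm for s, a in alpha.items()}   # rescale to avoid underflow
    return logp

def sample(T, pi, rng):
    """Draw a stationary trajectory Z^T of the joint chain."""
    z = rng.choices(STATES, weights=[pi[s] for s in STATES])[0]
    seq = [z]
    for _ in range(T - 1):
        z = rng.choices(STATES, weights=[q_z(zn, z) for zn in STATES])[0]
        seq.append(z)
    return seq

def estimate_r_x(T, seed=0):
    """Single-trajectory estimate (1/T) log[P(X^T) / P(reversed X^T)]."""
    rng = random.Random(seed)
    pi = stationary()
    xs = [x for x, _ in sample(T, pi, rng)]
    return (log_p_x(xs, pi) - log_p_x(xs[::-1], pi)) / T

if __name__ == "__main__":
    for T in (1_000, 10_000, 100_000):
        print(T, estimate_r_x(T))
```

A single trajectory gives a noisy estimate that can even be negative at small T; the values should settle only as T increases, in line with the convergence noted in the text.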

For numerical simulations, we randomly choose two groups of the parameters: the probabilities of the instructions of the baths ${\u03f5}_{a}$ and ${\u03f5}_{b}$ and the probabilities of the demon’s choices d (see Table A1 and Table A2). We evaluate ${R}_{x}$, $I(X,S)$ and ${I}_{B}(X,S)$ for both groups. The values of the numerical results are listed in Table A3.

| | $\{{\epsilon}_{a}(x=0),{\epsilon}_{a}(x=1),{\epsilon}_{a}(x=2)\}$ | $\{{\epsilon}_{b}(x=0),{\epsilon}_{b}(x=1),{\epsilon}_{b}(x=2)\}$ |
|---|---|---|
| Group 1 | $\{0.2344,0.2730,0.4926\}$ | $\{0.4217,0.4094,0.1689\}$ |
| Group 2 | $\{0.1305,0.3972,0.4723\}$ | $\{0.3358,0.0010,0.6633\}$ |

| | $\{d(s=a\mid x=0,s=a),d(s=b\mid x=0,s=a)\}$ | $\{d(s=a\mid x=0,s=b),d(s=b\mid x=0,s=b)\}$ |
|---|---|---|
| Group 1 | $\{0.3844,0.6156\}$ | $\{0.6811,0.3189\}$ |
| Group 2 | $\{0.1072,0.8928\}$ | $\{0.7473,0.2527\}$ |

| | $\{d(s=a\mid x=1,s=a),d(s=b\mid x=1,s=a)\}$ | $\{d(s=a\mid x=1,s=b),d(s=b\mid x=1,s=b)\}$ |
|---|---|---|
| Group 1 | $\{0.5195,0.4805\}$ | $\{0.8088,0.1912\}$ |
| Group 2 | $\{0.6595,0.3405\}$ | $\{0.1600,0.8400\}$ |

| | $\{d(s=a\mid x=2,s=a),d(s=b\mid x=2,s=a)\}$ | $\{d(s=a\mid x=2,s=b),d(s=b\mid x=2,s=b)\}$ |
|---|---|---|
| Group 1 | $\{0.3775,0.6225\}$ | $\{0.3340,0.6660\}$ |
| Group 2 | $\{0.0232,0.9768\}$ | $\{0.0814,0.9186\}$ |

| | ${R}_{z}$ | ${R}_{x}$ | $I(X,S)$ | ${I}_{B}(X,S)$ |
|---|---|---|---|---|
| Group 1 | $0.0645$ | $0.0018$ | $0.0885$ | $0.0313$ |
| Group 2 | $0.5485$ | $0.1291$ | $0.3385$ | $0.2097$ |

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).