Fluctuation Theorem of Information Exchange within an Ensemble of Paths Conditioned on Correlated-Microstates

Fluctuation theorems are a class of equalities that express universal properties of the probability distribution of a fluctuating path functional such as heat, work or entropy production over an ensemble of trajectories during a non-equilibrium process with a well-defined initial distribution. Jinwoo and Tanaka (Jinwoo, L.; Tanaka, H. Sci. Rep. 2015, 5, 7832) have shown that work fluctuation theorems hold even within an ensemble of paths to each state, making it clear that entropy and free energy of each microstate encode heat and work, respectively, within the conditioned set. Here we show that information that is characterized by the point-wise mutual information for each correlated state between two subsystems in a heat bath encodes the entropy production of the subsystems and heat bath during a coupling process. To this end, we extend the fluctuation theorem of information exchange (Sagawa, T.; Ueda, M. Phys. Rev. Lett. 2012, 109, 180602) by showing that the fluctuation theorem holds even within an ensemble of paths that reach a correlated state during dynamic co-evolution of two subsystems.


Introduction
Thermal fluctuations play an important role in the functioning of molecular machines: fluctuations mediate the exchange of energy between molecules and the environment, enabling molecules to overcome free energy barriers and to stabilize in low free energy regions. They make positions and velocities random variables, and thus make path functionals such as heat and work fluctuating quantities. In the past two decades, a class of relations called fluctuation theorems has shown that there are universal laws that regulate fluctuating quantities during a process that drives a system far from equilibrium. To mention a few, the Jarzynski equality links work to the change of equilibrium free energy [1], and the Crooks fluctuation theorem relates the probability of work to the dissipation of work [2]. There are many variations on these basic relations. Seifert has extended the second law to the level of individual trajectories [3], and Hatano and Sasa have considered transitions between steady states [4]. Experiments at the single-molecule level have verified the fluctuation theorems, providing critical insights into the behavior of bio-molecules [5][6][7][8][9][10][11][12][13].
Information is an essential subtopic of fluctuation theorems [14][15][16]. Beginning with pioneering studies on feedback-controlled systems [17,18], unifying formulations of information thermodynamics have been established [19][20][21][22][23]. In particular, Sagawa and Ueda have introduced information to the realm of fluctuation theorems [24]. They have established a fluctuation theorem of information exchange, unifying non-equilibrium processes of measurement and feedback control [25]. They have considered a situation where a system, say X, evolves in a manner that depends on the state y of another system Y, whose state is fixed during the evolution of X. In this setup, they have shown that establishing a correlation between the two subsystems is accompanied by entropy production. Very recently, we have lifted the constraint that Sagawa and Ueda assumed, and proved that the same form of the fluctuation theorem of information exchange holds even when both subsystems X and Y co-evolve in time [26].
In the context of fluctuation theorems, an external control parameter $\lambda_t$ defines a process by varying in a predetermined manner during $0 \le t \le \tau$. One repeats the process starting from the initial probability distribution $P_0$, and the system generates, as a response, an ensemble of microscopic trajectories $\{x_t\}$. Jinwoo and Tanaka [27,28] have shown that the Jarzynski equality and the Crooks fluctuation theorem hold even within an ensemble of trajectories conditioned on a fixed microstate at final time $\tau$, where the local form of non-equilibrium free energy replaces the role of equilibrium free energy in the equations, making it clear that the free energy of microstate $x_\tau$ encodes the amount of work supplied for reaching $x_\tau$ during process $\lambda_t$. Here local means that a term is related to microstate $x$ at time $\tau$, considered as an ensemble of the trajectories that reach it.
In this paper, we apply this conceptual framework of considering a single microstate as an ensemble of trajectories to the fluctuation theorem of information exchange (see Figure 1a). We show that the mutual information of correlated-microstates encodes the amount of entropy production within the ensemble of paths that reach the correlated-states. This local version of the fluctuation theorem of information exchange provides much more detailed information for each pair of correlated-microstates than the results in [25,26]. In the existing approaches that consider the ensemble of all paths, the point-wise mutual information does not provide specific details on individual correlated-microstates, but in this new approach of focusing on a subset of the ensemble, local mutual information provides detailed knowledge on particular correlated-states.
We organize the paper as follows: In Section 2, we briefly review some fluctuation theorems that we have mentioned. In Section 3, we prove the main theorem and its corollary. In Section 4, we provide illustrative examples, and in Section 5, we discuss the implication of the results.
Figure 1. Ensemble of conditioned paths and dynamic information exchange: (a) $\Gamma$ and $\Gamma_{x_\tau,y_\tau}$ denote, respectively, the set of all trajectories during process $\lambda_t$ for $0 \le t \le \tau$ and that of paths that reach $(x_\tau, y_\tau)$ at time $\tau$. Red curves schematically represent some members of $\Gamma_{x_\tau,y_\tau}$. (b) A single trajectory from the left panel is magnified to give a detailed view of the dynamic coupling of $(x_\tau, y_\tau)$ during process $\lambda_t$. The point-wise mutual information $I_t(x_t, y_t)$ may vary, not necessarily monotonically.

Conditioned Nonequilibrium Work Relations and Sagawa-Ueda Fluctuation Theorem
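We consider a system in contact with a heat bath of inverse temperature $\beta := 1/(k_B T)$, where $k_B$ is the Boltzmann constant and $T$ is the temperature of the heat bath. External parameter $\lambda_t$ drives the system away from equilibrium during $0 \le t \le \tau$. We assume that the initial probability distribution is the equilibrium one at control parameter $\lambda_0$. Let $\Gamma$ be the set of all microscopic trajectories, and $\Gamma_{x_\tau}$ be that of paths conditioned on $x_\tau$ at time $\tau$. Then, the Jarzynski equality [1] and the end-point conditioned version [27,28] of it read as follows:
$$\langle e^{-\beta W} \rangle_{\Gamma} = e^{-\beta [F_{eq}(\lambda_\tau) - F_{eq}(\lambda_0)]}, \tag{1}$$
$$\langle e^{-\beta W} \rangle_{\Gamma_{x_\tau}} = e^{-\beta [F(x_\tau,\tau) - F_{eq}(\lambda_0)]}, \tag{2}$$
respectively, where brackets $\langle \cdot \rangle_{\Gamma}$ indicate the average over all trajectories in $\Gamma$ and $\langle \cdot \rangle_{\Gamma_{x_\tau}}$ indicate the average over trajectories reaching $x_\tau$ at time $\tau$. Here $W$ indicates work done on the system through $\lambda_t$, $F_{eq}(\lambda_t)$ is the equilibrium free energy at control parameter $\lambda_t$, and $F(x_\tau, \tau)$ is the local non-equilibrium free energy of $x_\tau$ at time $\tau$. Work measurement over a specific ensemble of paths gives us the equilibrium free energy as a function of $\lambda_\tau$ through Equation (1) and the local non-equilibrium free energy as a micro-state function of $x_\tau$ at time $\tau$ through Equation (2). The following fluctuation theorem links Equations (1) and (2):
$$\langle e^{-\beta F(x_\tau,\tau)} \rangle_{x_\tau} = e^{-\beta F_{eq}(\lambda_\tau)}, \tag{3}$$
where brackets $\langle \cdot \rangle_{x_\tau}$ indicate the average over all microstates $x_\tau$ at time $\tau$ [27,28]. Defining the reverse process by $\lambda'_t := \lambda_{\tau-t}$ for $0 \le t \le \tau$, the Crooks fluctuation theorem [2] and the end-point conditioned version [27,28] of it read as follows:
$$\frac{P_{\Gamma}(W)}{P'_{\Gamma}(-W)} = e^{\beta \{W - [F_{eq}(\lambda_\tau) - F_{eq}(\lambda_0)]\}}, \tag{4}$$
$$\frac{P_{\Gamma_{x_\tau}}(W)}{P'_{\Gamma_{x_\tau}}(-W)} = e^{\beta \{W - [F(x_\tau,\tau) - F_{eq}(\lambda_0)]\}}, \tag{5}$$
respectively, where $P_{\Gamma}(W)$ and $P_{\Gamma_{x_\tau}}(W)$ are probability distributions of work $W$ normalized over all paths in $\Gamma$ and $\Gamma_{x_\tau}$, respectively. Here $P'$ indicates the corresponding probabilities for the reverse process. For Equation (4), the initial probability distribution of the reverse process is the equilibrium one at control parameter $\lambda_\tau$.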
On the other hand, for Equation (5), the initial probability distribution for the reverse process should be the non-equilibrium probability distribution $p(x_\tau, \tau)$ of the forward process at control parameter $\lambda_\tau$. By identifying the value of $W$ at which the forward and reverse work distributions cross, one obtains $\Delta F_{eq} := F_{eq}(\lambda_\tau) - F_{eq}(\lambda_0)$, the difference in equilibrium free energy between $\lambda_0$ and $\lambda_\tau$, through Equation (4) [9]. Similar identification may provide $\Delta F := F(x_\tau, \tau) - F_{eq}(\lambda_0)$ through Equation (5). Now we turn to the Sagawa-Ueda fluctuation theorem of information exchange [25]. Specifically, we discuss the generalized version [26] of it. To this end, we consider two subsystems X and Y in the heat bath of inverse temperature $\beta$. During process $\lambda_t$, they interact and co-evolve with each other. Then, the fluctuation theorem of information exchange reads as follows:
$$\langle e^{-\sigma + \Delta I} \rangle_{\Gamma} = 1, \tag{6}$$
where brackets indicate the ensemble average over all paths of the combined subsystems, $\sigma$ is the sum of entropy production of system X, system Y, and the heat bath, and $\Delta I$ is the change in mutual information between X and Y. We note that in the original version of the Sagawa-Ueda fluctuation theorem, only system X is in contact with the heat bath and Y does not evolve during the process [25,26].
In this paper, we prove an end-point conditioned version of Equation (6):
$$\langle e^{-(\sigma + I_0)} \rangle_{x_\tau,y_\tau} = e^{-I_\tau(x_\tau,y_\tau)}, \tag{7}$$
where brackets indicate the ensemble average over all paths to $x_\tau$ and $y_\tau$ at time $\tau$, and $I_t$ ($0 \le t \le \tau$) is the local form of mutual information between microstates of X and Y at time $t$ (see Figure 1b). If there is no initial correlation, i.e., $I_0 = 0$, Equation (7) clearly indicates that local mutual information $I_\tau$ as a function of correlated-microstates $(x_\tau, y_\tau)$ encodes entropy production $\sigma$ within the end-point conditioned ensemble of paths. In the same vein, we may interpret initial correlation $I_0$ as encoded entropy production for the preparation of the initial condition.
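The link between the ordinary and the end-point conditioned Jarzynski equalities can be checked numerically for a small discrete system. The sketch below assumes the local non-equilibrium free energy has the form F(x, τ) = E(x) + k_B T ln p(x, τ), consistent with the definition used later in the Corollary, and verifies that averaging exp(−βF) over an arbitrary final distribution gives exp(−βF_eq). The energy levels, the distribution, and all function names are hypothetical choices made only for illustration.

```python
import math

BETA = 1.0  # inverse temperature; hypothetical units with k_B T = 1


def local_free_energy(energy, prob, beta=BETA):
    """Assumed form F(x, tau) = E(x) + (1/beta) * ln p(x, tau)."""
    return energy + math.log(prob) / beta


def equilibrium_free_energy(energies, beta=BETA):
    """F_eq = -(1/beta) * ln Z for a discrete set of energy levels."""
    partition = sum(math.exp(-beta * e) for e in energies)
    return -math.log(partition) / beta


def check_link(energies, probs, beta=BETA):
    """Return (<exp(-beta F)>_x, exp(-beta F_eq)); the two should agree."""
    lhs = sum(
        p * math.exp(-beta * local_free_energy(e, p, beta))
        for e, p in zip(energies, probs)
    )
    rhs = math.exp(-beta * equilibrium_free_energy(energies, beta))
    return lhs, rhs


# Hypothetical energy levels at the final control parameter and an
# arbitrary (non-equilibrium) final distribution over the same states.
energies = [0.0, 0.7, 1.5]
probs = [0.5, 0.2, 0.3]
lhs, rhs = check_link(energies, probs)
print(lhs, rhs)
```

The agreement does not depend on the particular distribution chosen, because the factor p(x, τ) in the average cancels against the logarithmic term in the local free energy.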

Theoretical Framework
Let X and Y be finite classical stochastic systems in the heat bath of inverse temperature $\beta$. We allow external parameter $\lambda_t$ to drive one or both subsystems away from equilibrium during time $0 \le t \le \tau$ [29][30][31]. We assume that classical stochastic dynamics describes the time evolution of X and Y during process $\lambda_t$ along trajectories $\{x_t\}$ and $\{y_t\}$, respectively, where $x_t$ ($y_t$) denotes a specific microstate of X (Y) at time $t$ for $0 \le t \le \tau$ on each trajectory. Since trajectories fluctuate, we repeat process $\lambda_t$ with initial joint probability distribution $P_0(x, y)$ over all microstates $(x, y)$ of systems X and Y. Then the subsystems generate a joint probability distribution $P_t(x, y)$ for $0 \le t \le \tau$. Let $P_t(x) := \int P_t(x, y)\,dy$ and $P_t(y) := \int P_t(x, y)\,dx$ be the corresponding marginal probability distributions. We assume
$$P_0(x, y) \neq 0 \quad \text{for all } (x, y), \tag{8}$$
so that we have $P_t(x, y) \neq 0$, $P_t(x) \neq 0$, and $P_t(y) \neq 0$ for all $x$ and $y$ during $0 \le t \le \tau$. Now we consider the entropy production $\sigma$ of system X along $\{x_t\}$, system Y along $\{y_t\}$, and the heat bath, which receives heat $Q_b$, during process $\lambda_t$ for $0 \le t \le \tau$ as follows:
$$\sigma := \Delta s + \beta Q_b, \tag{9}$$
where
$$\Delta s := \Delta s_x + \Delta s_y, \qquad \Delta s_x := -\ln P_\tau(x_\tau) + \ln P_0(x_0), \qquad \Delta s_y := -\ln P_\tau(y_\tau) + \ln P_0(y_0). \tag{10}$$
We remark that Equation (10) is different from the change of stochastic entropy of the combined super-system composed of X and Y, which reads $\ln P_0(x_0, y_0) - \ln P_\tau(x_\tau, y_\tau)$ and reduces to Equation (10) if processes $\{x_t\}$ and $\{y_t\}$ are independent. The discrepancy leaves room for the correlation defined in Equation (11) below [25]. Here the stochastic entropy $s[P_t(\circ)] := -\ln P_t(\circ)$ of microstate $\circ$ at time $t$ is the uncertainty of $\circ$ at time $t$: the more uncertain it is that microstate $\circ$ occurs, the greater the stochastic entropy of $\circ$ is. We also note that in [25], system X was in contact with the heat reservoir, but system Y was not, nor did system Y evolve. Thus their entropy production reads $\sigma_{su} := \Delta s_x + \beta Q_b$. Now we assume that, during process $\lambda_t$, system X exchanges information with system Y.
By this, we mean that trajectory $\{x_t\}$ of system X evolves depending on the trajectory $\{y_t\}$ of system Y (see Figure 1b). Then, the local form of mutual information $I_t$ at time $t$ between $x_t$ and $y_t$ is the reduction of uncertainty of $x_t$ due to given $y_t$ [25]:
$$I_t(x_t, y_t) := s[P_t(x_t)] - s[P_t(x_t|y_t)] = \ln \frac{P_t(x_t, y_t)}{P_t(x_t)\,P_t(y_t)}, \tag{11}$$
where $P_t(x_t|y_t)$ is the conditional probability distribution of $x_t$ given $y_t$. The more information is shared between $x_t$ and $y_t$ for their occurrence, the larger the value of $I_t(x_t, y_t)$ is. We note that if $x_t$ and $y_t$ are independent at time $t$, $I_t(x_t, y_t)$ becomes zero. The average of $I_t(x_t, y_t)$ with respect to $P_t(x_t, y_t)$ over all microstates is the mutual information between the two subsystems, which is greater than or equal to zero [32].
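The remark that the sum of marginal entropy changes differs from the joint stochastic entropy change can be made explicit. The following is a sketch, assuming that $\Delta s_x$ and $\Delta s_y$ are the changes of marginal stochastic entropy, $\Delta s_x = \ln P_0(x_0) - \ln P_\tau(x_\tau)$ and likewise for $y$, and that $I_t$ is the point-wise mutual information $\ln[P_t(x_t,y_t)/(P_t(x_t)P_t(y_t))]$. The symbol $\Delta s_{xy}$ for the joint entropy change is introduced here only for illustration.

```latex
\begin{align}
\Delta s_{xy} &:= \ln P_0(x_0,y_0) - \ln P_\tau(x_\tau,y_\tau), \\
\Delta s_x + \Delta s_y - \Delta s_{xy}
  &= \ln\frac{P_\tau(x_\tau,y_\tau)}{P_\tau(x_\tau)\,P_\tau(y_\tau)}
   - \ln\frac{P_0(x_0,y_0)}{P_0(x_0)\,P_0(y_0)} \\
  &= I_\tau(x_\tau,y_\tau) - I_0(x_0,y_0).
\end{align}
```

Under these assumptions the two notions of entropy change coincide exactly when the point-wise mutual information is the same at both ends of the path, for instance when the two processes are independent.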

Proof of Fluctuation Theorem of Information Exchange Conditioned on a Correlated-Microstates
Now we are ready to prove the fluctuation theorem of information exchange conditioned on correlated-microstates. We define the reverse process $\lambda'_t := \lambda_{\tau-t}$ for $0 \le t \le \tau$, where the external parameter is time-reversed [33,34]. The initial probability distribution $P'_0(x, y)$ for the reverse process should be the final probability distribution $P_\tau(x, y)$ of the forward process, so that we have
$$P'_0(x) = \int P'_0(x, y)\,dy = \int P_\tau(x, y)\,dy = P_\tau(x), \qquad P'_0(y) = \int P'_0(x, y)\,dx = \int P_\tau(x, y)\,dx = P_\tau(y). \tag{12}$$
Then, by Equation (8), we have $P'_t(x, y) \neq 0$, $P'_t(x) \neq 0$, and $P'_t(y) \neq 0$ for all $x$ and $y$ during $0 \le t \le \tau$. For each pair of trajectories $\{x_t\}$ and $\{y_t\}$ for $0 \le t \le \tau$, we define the time-reversed conjugates as follows:
$$\{x'_t\} := \{x^*_{\tau-t}\}, \qquad \{y'_t\} := \{y^*_{\tau-t}\}, \tag{13}$$
where $*$ denotes momentum reversal. Let $\Gamma$ be the set of all trajectories $\{x_t\}$ and $\{y_t\}$, and $\Gamma_{x_\tau,y_\tau}$ be that of trajectories conditioned on correlated-microstates $(x_\tau, y_\tau)$ at time $\tau$. Due to the time-reversal symmetry of the underlying microscopic dynamics, the set $\Gamma'$ of all time-reversed trajectories is identical to $\Gamma$, and the set $\Gamma'_{x'_0,y'_0}$ of time-reversed trajectories conditioned on $x'_0$ and $y'_0$ is identical to $\Gamma_{x_\tau,y_\tau}$. Thus we may use the same notation for both forward and backward pairs. We note that the path probabilities $P_{\Gamma}$ and $P_{\Gamma_{x_\tau,y_\tau}}$ are normalized over all paths in $\Gamma$ and $\Gamma_{x_\tau,y_\tau}$, respectively (see Figure 1a). With this notation, the microscopic reversibility condition that enables us to connect the probability of forward and reverse paths to dissipated heat reads as follows [2,[35][36][37]:
$$\frac{P_{\Gamma}(\{x_t\}, \{y_t\} \,|\, x_0, y_0)}{P'_{\Gamma}(\{x'_t\}, \{y'_t\} \,|\, x'_0, y'_0)} = e^{\beta Q_b}, \tag{14}$$
where $P_{\Gamma}(\{x_t\}, \{y_t\} \,|\, x_0, y_0)$ is the conditional joint probability distribution of paths $\{x_t\}$ and $\{y_t\}$ conditioned on initial microstates $x_0$ and $y_0$, and $P'_{\Gamma}(\{x'_t\}, \{y'_t\} \,|\, x'_0, y'_0)$ is that for the reverse process. Now we restrict our attention to those paths that are in $\Gamma_{x_\tau,y_\tau}$, and divide both numerator and denominator of the left-hand side of Equation (14) by $P_\tau(x_\tau, y_\tau)$.
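Since $P_\tau(x_\tau, y_\tau)$ is identical to the initial probability $P'_0(x'_0, y'_0)$ of the reverse process, Equation (14) becomes as follows:
$$\frac{P_{\Gamma_{x_\tau,y_\tau}}(\{x_t\}, \{y_t\} \,|\, x_0, y_0)}{P'_{\Gamma_{x_\tau,y_\tau}}(\{x'_t\}, \{y'_t\} \,|\, x'_0, y'_0)} = e^{\beta Q_b}, \tag{15}$$
since the probability of paths is now normalized over $\Gamma_{x_\tau,y_\tau}$. Then we have the following:
\begin{align}
\frac{P'_{\Gamma_{x_\tau,y_\tau}}(\{x'_t\}, \{y'_t\})}{P_{\Gamma_{x_\tau,y_\tau}}(\{x_t\}, \{y_t\})}
&= \frac{P'_{\Gamma_{x_\tau,y_\tau}}(\{x'_t\}, \{y'_t\} \,|\, x'_0, y'_0)}{P_{\Gamma_{x_\tau,y_\tau}}(\{x_t\}, \{y_t\} \,|\, x_0, y_0)} \cdot \frac{P'_0(x'_0, y'_0)}{P_0(x_0, y_0)} \tag{16} \\
&= \frac{P'_{\Gamma_{x_\tau,y_\tau}}(\{x'_t\}, \{y'_t\} \,|\, x'_0, y'_0)}{P_{\Gamma_{x_\tau,y_\tau}}(\{x_t\}, \{y_t\} \,|\, x_0, y_0)} \cdot \frac{P'_0(x'_0, y'_0)}{P'_0(x'_0) P'_0(y'_0)} \cdot \frac{P_0(x_0) P_0(y_0)}{P_0(x_0, y_0)} \cdot \frac{P'_0(x'_0) P'_0(y'_0)}{P_0(x_0) P_0(y_0)} \tag{17} \\
&= \exp\left[-\beta Q_b + I_\tau(x_\tau, y_\tau) - I_0(x_0, y_0) - \Delta s_x - \Delta s_y\right] \tag{18} \\
&= \exp\left[-\sigma + I_\tau(x_\tau, y_\tau) - I_0(x_0, y_0)\right]. \tag{19}
\end{align}
To obtain Equation (17) from Equation (16), we multiply Equation (16) by $\frac{P_0(x_0) P_0(y_0)}{P_0(x_0) P_0(y_0)}$ and $\frac{P'_0(x'_0) P'_0(y'_0)}{P'_0(x'_0) P'_0(y'_0)}$, which are 1. We obtain Equation (18) by applying Equations (10)-(12) and (15) to Equation (17). Finally, we use Equation (9) to obtain Equation (19) from Equation (18). Now we multiply both sides of Equation (19) by $e^{-I_\tau(x_\tau, y_\tau)}$ and $P_{\Gamma_{x_\tau,y_\tau}}(\{x_t\}, \{y_t\})$, and take the integral over all paths in $\Gamma_{x_\tau,y_\tau}$ to obtain the fluctuation theorem of information exchange conditioned on correlated-microstates:
$$\langle e^{-(\sigma + I_0)} \rangle_{x_\tau,y_\tau} = e^{-I_\tau(x_\tau, y_\tau)}. \tag{20}$$
Here we use the facts that $e^{-I_\tau(x_\tau, y_\tau)}$ is constant for all paths in $\Gamma_{x_\tau,y_\tau}$, that the probability distribution $P'_{\Gamma_{x_\tau,y_\tau}}$ is normalized over all paths in $\Gamma_{x_\tau,y_\tau}$, and that $d\{x'_t\} = d\{x_t\}$ and $d\{y'_t\} = d\{y_t\}$ due to the time-reversal symmetry [38]. Equation (20) clearly shows that just as local free energy encodes work [27], and local entropy encodes heat [28], the local form of mutual information between correlated-microstates $(x_\tau, y_\tau)$ encodes entropy production within the ensemble of paths that reach each microstate. The following corollary provides more information on entropy production in terms of energetic costs.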
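The conditioned theorem can be checked by exact enumeration in a toy model. The sketch below is not the construction used in the proof: it assumes a discrete-time Markov chain on a small joint state space, a Metropolis update that obeys detailed balance at each step (standing in for microscopic reversibility, with no momenta), and takes the relation in the form ⟨exp(−(σ + I_0))⟩ = exp(−I_τ) over paths ending in a given (x_τ, y_τ). The energy function, the protocol values, the initial weights, and all names are hypothetical.

```python
import itertools
import math

BETA = 1.0
NX, NY = 2, 3
STATES = [(x, y) for x in range(NX) for y in range(NY)]
LAMBDAS = [0.5, 1.5]  # hypothetical protocol: one control value per update step


def energy(state, lam):
    """Hypothetical joint energy; lam sets the strength of an X-Y coupling."""
    x, y = state
    return 0.3 * x + 0.2 * y - lam * (1.0 if x == y % NX else 0.0)


def transition(new, old, lam):
    """Metropolis rule with uniform proposals; obeys detailed balance at lam."""
    def accept(a, b):  # acceptance probability for the move b -> a
        return min(1.0, math.exp(-BETA * (energy(a, lam) - energy(b, lam))))
    if new != old:
        return accept(new, old) / len(STATES)
    return 1.0 - sum(accept(s, old) for s in STATES if s != old) / len(STATES)


def marginals(joint):
    px = [sum(joint[(x, y)] for y in range(NY)) for x in range(NX)]
    py = [sum(joint[(x, y)] for x in range(NX)) for y in range(NY)]
    return px, py


def pmi(joint, state):
    """Point-wise mutual information ln[P(x, y) / (P(x) P(y))]."""
    px, py = marginals(joint)
    x, y = state
    return math.log(joint[state] / (px[x] * py[y]))


def conditioned_check():
    """Map each end state to (<exp(-(sigma + I_0))>, exp(-I_tau))."""
    weights = {s: 1.0 + s[0] + 2.0 * s[1] + s[0] * s[1] for s in STATES}
    total = sum(weights.values())
    p0 = {s: w / total for s, w in weights.items()}  # correlated initial state
    paths = {}
    for path in itertools.product(STATES, repeat=len(LAMBDAS) + 1):
        prob, heat = p0[path[0]], 0.0
        for k, lam in enumerate(LAMBDAS):
            prob *= transition(path[k + 1], path[k], lam)
            # heat released to the bath is minus the system's energy gain
            heat -= energy(path[k + 1], lam) - energy(path[k], lam)
        paths[path] = (prob, heat)
    p_tau = {s: sum(p for q, (p, _) in paths.items() if q[-1] == s) for s in STATES}
    px0, py0 = marginals(p0)
    pxt, pyt = marginals(p_tau)
    result = {}
    for end in STATES:
        average = 0.0
        for path, (prob, heat) in paths.items():
            if path[-1] != end:
                continue
            (x0, y0), (xt, yt) = path[0], end
            sigma = (math.log(px0[x0] / pxt[xt])
                     + math.log(py0[y0] / pyt[yt]) + BETA * heat)
            average += prob / p_tau[end] * math.exp(-(sigma + pmi(p0, path[0])))
        result[end] = (average, math.exp(-pmi(p_tau, end)))
    return result


result = conditioned_check()
for end, (lhs, rhs) in result.items():
    print(end, lhs, rhs)
```

Exact enumeration is used instead of sampling so that the two sides can be compared to floating-point precision; the state space is kept small because the number of paths grows exponentially with the number of steps.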

Corollary
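To discuss entropy production in terms of energetic costs, we define the local free energy $F_x$ of $x_t$ and $F_y$ of $y_t$ at control parameter $\lambda_t$ as follows:
$$F_x(x_t, t) := E_x(x_t, t) - k_B T\, s[P_t(x_t)], \qquad F_y(y_t, t) := E_y(y_t, t) - k_B T\, s[P_t(y_t)], \tag{21}$$
where $T$ is the temperature of the heat bath, $k_B$ is the Boltzmann constant, $E_x$ and $E_y$ are the internal energies of systems X and Y, respectively, and $s[P_t(\circ)] := -\ln P_t(\circ)$ is the stochastic entropy [2,3]. Work done on either one or both systems through process $\lambda_t$ is expressed by the first law of thermodynamics as follows:
$$W := \Delta E + Q_b, \tag{22}$$
where $\Delta E$ is the change in internal energy of the total system composed of X and Y. If we assume that systems X and Y are weakly coupled, in that the interaction energy between X and Y is negligible compared to the internal energy of X and Y, we may have
$$\Delta E = \Delta E_x + \Delta E_y, \tag{23}$$
where $\Delta E_x := E_x(x_\tau, \tau) - E_x(x_0, 0)$ and $\Delta E_y := E_y(y_\tau, \tau) - E_y(y_0, 0)$ [39]. We rewrite Equation (18) by adding and subtracting the change of internal energy $\Delta E_x$ of X and $\Delta E_y$ of Y as follows:
\begin{align}
\frac{P'_{\Gamma_{x_\tau,y_\tau}}(\{x'_t\}, \{y'_t\})}{P_{\Gamma_{x_\tau,y_\tau}}(\{x_t\}, \{y_t\})}
&= \exp\left[-\beta (Q_b + \Delta E_x + \Delta E_y) + \beta \Delta E_x - \Delta s_x + \beta \Delta E_y - \Delta s_y + I_\tau(x_\tau, y_\tau) - I_0(x_0, y_0)\right] \tag{24} \\
&= \exp\left[-\beta (W - \Delta F_x - \Delta F_y) + I_\tau(x_\tau, y_\tau) - I_0(x_0, y_0)\right], \tag{25}
\end{align}
where we have applied Equations (21)-(23) consecutively to Equation (24) to obtain Equation (25).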
Here $\Delta F_x := F_x(x_\tau, \tau) - F_x(x_0, 0)$ and $\Delta F_y := F_y(y_\tau, \tau) - F_y(y_0, 0)$. Now we multiply both sides of Equation (25) by $e^{-I_\tau(x_\tau, y_\tau)}$ and $P_{\Gamma_{x_\tau,y_\tau}}(\{x_t\}, \{y_t\})$, and take the integral over all paths in $\Gamma_{x_\tau,y_\tau}$ to obtain the following:
$$\langle e^{-\beta (W - \Delta F_x - \Delta F_y) - I_0} \rangle_{x_\tau,y_\tau} = e^{-I_\tau(x_\tau, y_\tau)}, \tag{26}$$
which generalizes known relations in the literature [24,[39][40][41][42][43]. We note that Equation (26) holds under the weak-coupling assumption between systems X and Y during process $\lambda_t$, and that $\Delta F_x + \Delta F_y$ in Equation (26) is the difference in non-equilibrium free energy, which is different from the change in equilibrium free energy that appears in similar relations in the literature [24,[40][41][42][43]. If there is no initial correlation, i.e., $I_0 = 0$, Equation (26) indicates that local mutual information $I_\tau$ as a state function of correlated-microstates $(x_\tau, y_\tau)$ encodes entropy production, $\beta (W - \Delta F_x - \Delta F_y)$, within the ensemble of paths in $\Gamma_{x_\tau,y_\tau}$. In the same vein, we may interpret initial correlation $I_0$ as encoded entropy production for the preparation of the initial condition.
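Sagawa and Ueda showed in [25] that the entropy of X can be decreased without any heat flow due to a negative mutual information change, under the assumption that one of the two systems does not evolve in time. Equation (20) implies that a negative mutual information change can decrease the entropy of X and that of Y simultaneously without any heat flow, by the following:
$$\langle \Delta s_x \rangle_{x_\tau,y_\tau} + \langle \Delta s_y \rangle_{x_\tau,y_\tau} \ge \Delta I_\tau(x_\tau, y_\tau), \tag{27}$$
provided $\langle Q_b \rangle_{x_\tau,y_\tau} = 0$. Here $\Delta I_\tau(x_\tau, y_\tau) := I_\tau(x_\tau, y_\tau) - \langle I_0(x_0, y_0) \rangle_{x_0,y_0}$, where the average runs over the initial microstates $(x_0, y_0)$ of the paths in $\Gamma_{x_\tau,y_\tau}$. In terms of energetics, Equation (26) implies that a negative mutual information change can increase the free energy of X and that of Y simultaneously without any external supply of energy, by the following:
$$\langle \Delta F_x \rangle_{x_\tau,y_\tau} + \langle \Delta F_y \rangle_{x_\tau,y_\tau} \le -k_B T\, \Delta I_\tau(x_\tau, y_\tau), \tag{28}$$
provided $\langle W \rangle_{x_\tau,y_\tau} = 0$.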
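The two bounds of this section follow from the conditioned fluctuation theorems by Jensen's inequality. A sketch of that step is given below, assuming that Equation (20) has the form $\langle e^{-(\sigma + I_0)} \rangle = e^{-I_\tau}$, that Equation (26) is the same statement with $\sigma$ replaced by $\beta(W - \Delta F_x - \Delta F_y)$ under weak coupling, and writing $\langle \cdot \rangle$ for the average over paths in $\Gamma_{x_\tau,y_\tau}$.

```latex
\begin{align}
e^{-\langle \sigma + I_0 \rangle} \le \langle e^{-(\sigma + I_0)} \rangle = e^{-I_\tau(x_\tau,y_\tau)}
  &\;\Longrightarrow\; \langle \sigma \rangle \ge I_\tau(x_\tau,y_\tau) - \langle I_0 \rangle, \\
\langle Q_b \rangle = 0
  &\;\Longrightarrow\; \langle \Delta s_x \rangle + \langle \Delta s_y \rangle
     \ge I_\tau(x_\tau,y_\tau) - \langle I_0 \rangle, \\
\langle W \rangle = 0
  &\;\Longrightarrow\; \langle \Delta F_x \rangle + \langle \Delta F_y \rangle
     \le -k_B T \left[ I_\tau(x_\tau,y_\tau) - \langle I_0 \rangle \right].
\end{align}
```

When the right-hand bracket is negative, the first bound allows the summed entropy change to be negative and the last allows the summed free energy change to be positive, which is the sense in which correlation is consumed as a resource.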

A Simple One
Let X and Y be two systems that weakly interact with each other and are in contact with the heat bath of inverse temperature $\beta$. We may think of X and Y, for example, as bio-molecules that interact with each other, or of X as a device that measures the state of another system and Y as the measured system. We consider a dynamic coupling process as follows: Initially, X and Y are separately in equilibrium such that the initial correlation $I_0(x_0, y_0)$ is zero for all $x_0$ and $y_0$. At time $t = 0$, system X starts a (weak) interaction with system Y, which lasts until time $t = \tau$. During the coupling process, external parameter $\lambda_t$ for $0 \le t \le \tau$ may exchange work with either one or both systems (see Figure 1b). Since each process fluctuates, we repeat the process many times to obtain probability distribution $P_t(x, y)$ for $0 \le t \le \tau$. We allow both systems to co-evolve interactively, and thus $I_t(x_t, y_t)$ may vary, not necessarily monotonically. Let us assume that the final probability distribution $P_\tau(x_\tau, y_\tau)$ is as shown in Table 1. Then, a few representative values of mutual information read as follows:
$$I_\tau(x_\tau = 0, y_\tau = 0) = \ln(3/2), \qquad I_\tau(x_\tau = 0, y_\tau = 1) = 0, \qquad I_\tau(x_\tau = 0, y_\tau = 2) = \ln(1/2). \tag{29}$$
By Jensen's inequality [32], Equation (20) implies
$$\langle \sigma \rangle_{x_\tau,y_\tau} \ge I_\tau(x_\tau, y_\tau). \tag{30}$$
Thus coupling $x_\tau = 0$, $y_\tau = 0$ is accompanied on average by entropy production of at least $\ln(3/2)$, which is greater than 0. Coupling $x_\tau = 0$, $y_\tau = 1$ may not produce entropy on average. Coupling $x_\tau = 0$, $y_\tau = 2$ may on average produce negative entropy, bounded below by $\ln(1/2) = -\ln 2$. These three individual inequalities provide more detailed information than the single bound $\langle \sigma \rangle_{\Gamma} \ge \langle I_\tau(x_\tau, y_\tau) \rangle_{\Gamma} \approx 0.0872$ currently available from [25,26].
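The quoted bounds can be reproduced with a few lines of code. Table 1 is not reproduced in this text, so the joint distribution below is a hypothetical one, chosen only because it yields the quoted values ln(3/2), 0, ln(1/2) and the average of about 0.0872; it should not be read as the table itself. The names `ROW`, `JOINT`, and `pmi` are illustrative.

```python
import math
from fractions import Fraction

# Hypothetical 3x3 joint distribution (NOT taken from Table 1): a circulant
# arrangement of 1/6, 1/9, 1/18, so that every marginal equals 1/3.
ROW = [Fraction(1, 6), Fraction(1, 9), Fraction(1, 18)]
JOINT = {(x, y): ROW[(y - x) % 3] for x in range(3) for y in range(3)}


def pmi(x, y):
    """Point-wise mutual information ln[P(x, y) / (P(x) P(y))]."""
    px = sum(JOINT[(x, j)] for j in range(3))
    py = sum(JOINT[(i, y)] for i in range(3))
    return math.log(float(JOINT[(x, y)] / (px * py)))


# Per-state lower bounds on the conditioned average entropy production,
# valid when there is no initial correlation (I_0 = 0).
bounds = {state: pmi(*state) for state in JOINT}

# Ensemble-level bound: the ordinary mutual information.
average = sum(float(p) * pmi(x, y) for (x, y), p in JOINT.items())

for state in [(0, 0), (0, 1), (0, 2)]:
    print(state, round(bounds[state], 4))
print("average", round(average, 4))
```

The per-state values spread on both sides of zero, while the ensemble average collapses them into one small positive number, which illustrates what the conditioned bounds add.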

A "Tape-Driven" Biochemical Machine
In [44], McGrath et al. proposed a physically realizable device that exploits or creates mutual information, depending on system parameters. The system is composed of an enzyme E in a chemical bath, interacting with a tape that is decorated with a set of pairs of molecules (see Figure 2a). A pair is composed of substrate molecule X (or phosphorylated $X^*$) and activator Y of the enzyme (or $\bar{Y}$, which denotes the absence of Y). The binding of molecule Y to E converts the enzyme into active mode $E^\dagger$, which catalyzes phosphate exchange between ATP and X:
$$X + \mathrm{ATP} \rightleftharpoons X^* + \mathrm{ADP}. \tag{31}$$
The tape is prepared in a correlated manner through a single parameter $\Psi$. If $\Psi < 0.5$, a pair of Y and $X^*$ is abundant, so that the interaction of enzyme E with molecule Y activates the enzyme, causing the catalytic reaction of Equation (31) to run from the right to the left, resulting in the production of ATP from ADP. If the bath is prepared such that [ATP] > [ADP], the reaction corresponds to work on the chemical bath against the concentration gradient. Note that this interaction causes the conversion of $X^*$ to X, which reduces the initial correlation between $X^*$ and Y, resulting in the conversion of mutual information into work. If E interacts with a pair of $\bar{Y}$ and X, which is also abundant for $\Psi < 0.5$, the enzyme becomes inactive due to the absence of Y, preventing the reaction of Equation (31) from the left to the right. This acts as a ratchet that blocks the conversion of X and ATP to $X^*$ and ADP, which might happen otherwise due to the concentration gradient of the bath. On the other hand, if $\Psi > 0.5$, a pair of Y and X is abundant, which allows the enzyme to convert X into $X^*$ using the pressure of the chemical bath, creating the correlation between Y and $X^*$. If E interacts with a pair of $\bar{Y}$ and $X^*$, which is also abundant for $\Psi > 0.5$, the enzyme is again inactive, preventing the de-phosphorylation of $X^*$ and keeping the created correlation.
In this regime, the net effect is the conversion of work (due to the chemical gradient of the bath) to mutual information. The concentrations of ATP and ADP in the chemical bath are adjusted via a parameter $\alpha \in (-1, 1)$ relative to a reference concentration $C_0$. For the analysis of various regimes of different parameters, we refer the reader to [44].
In this example, we concentrate on the case with $\alpha = 0.99$ and $\Psi = 0.69$, to which Ref. [44] pays special attention. They analyzed the dynamics of mutual information $\langle I_t \rangle_{\Gamma}$ during $10^{-2} \le t \le 10^{2}$. Due to the high initial correlation, the enzyme converts the mutual information between $X^*$ and Y into work against the pressure of the chemical bath with [ATP] > [ADP]. As the reactions proceed, correlation $\langle I_t \rangle_{\Gamma}$ drops until it reaches its minimum, which is zero. Then, eventually the reaction is inverted, and the bath begins to work to create mutual information between $X^*$ and Y, as shown in Figure 2b.
We split the ensemble $\Gamma^t$ of paths into $\Gamma^t_{X,Y}$, composed of trajectories reaching $(X, Y)$ at time $t$, and $\Gamma^t_{X^*,Y}$, composed of those reaching $(X^*, Y)$ at time $t$. Then, we calculate $I_t(X, Y)$ and $I_t(X^*, Y)$ using the analytic form of the probability distributions derived in [44]. Figure 2c,d show $I_t(X, Y)$ and $I_t(X^*, Y)$, respectively, as a function of time $t$. During the whole process, mutual information $I_t(X, Y)$ monotonically decreases. For $10^{-2} \le t \le 10^{1/3}$, it stays positive, and after that, it becomes negative, which is possible for local mutual information. Trajectories in $\Gamma^t_{X,Y}$ harness mutual information between $X^*$ and Y, converting $X^*$ to X and ADP to ATP against the chemical bath. Contrary to this, $I_t(X^*, Y)$ increases monotonically. It is negative at first, indicating that the members of $\Gamma^t_{X^*,Y}$ create mutual information between $X^*$ and Y by converting X to $X^*$ using the excess of ATP in the chemical bath. The effect accumulates, and $I_t(X^*, Y)$ turns positive for $t > 10^{1/3}$.

Conclusions
We have proved the fluctuation theorem of information exchange conditioned on correlated-microstates, Equation (20), and its corollary, Equation (26). These theorems make it clear that local mutual information, as a state function of correlated-states, encodes entropy production within the ensemble of paths that reach the correlated-states. Equation (20) also yields a lower bound on entropy production, Equation (30), within a subset of the path ensemble, which provides more detailed information than the fluctuation theorem over the ensemble of all paths. Equation (26) gives the exact relationship between work, non-equilibrium free energy, and mutual information. This end-point conditioned version of the theorem also provides more detailed information on the energetics of coupling than current approaches in the literature. This framework may be useful for analyzing the thermodynamics of dynamic molecular information processes [44][45][46] and dynamic allosteric transitions [47,48].