Fitness Gain of Individually Sensed Information by Cells

Mutual information and its causal variant, directed information, have been widely used to quantitatively characterize the performance of biological sensing and information transduction. However, once coupled with selection in response to decision-making, the sensing signal can have more or less evolutionary value than its mutual or directed information. In this work, we show that an individually sensed signal always has a better fitness value, on average, than its mutual or directed information. The fitness gain, which satisfies fluctuation relations (FRs), is attributed to the selection of organisms in a population that obtain a better sensing signal by chance. A new quantity, similar to the coarse-grained entropy production in information thermodynamics, is introduced to quantify the total fitness gain from individual sensing; it also satisfies FRs. Using this quantity, optimizing the fitness gain of individual sensing is shown to be related to fidelity allocations for individual environmental histories. Our results are supplemented by numerical verifications of the FRs and by a discussion of how this problem is linked to information encoding and decoding.


Introduction
Most biological systems are equipped with active sensing machinery to monitor the ever-changing environment. The fidelity of sensing is crucial to choosing appropriate states and behaviors in response to changes in environmental states [1][2][3]. Instantaneous mutual information, path-wise mutual information, and its causal variant, directed information, have been used to quantitatively characterize the performance of sensing and information transduction, theoretically [4][5][6][7] and experimentally [8][9][10][11]. These information measures are also fundamental to the thermodynamic cost of sensing [12][13][14][15][16].
However, it is still elusive whether these measures can appropriately quantify the biological and fitness value of sensed information. Despite intensive work on the fitness value of information [17][18][19][20][21][22][23][24][25], almost all studies considered a biologically unrealistic situation in which all cells or organisms in a population receive a common sensing signal, which is a prerequisite for proving that the fitness value of sensing is bounded by these information measures. A few studies have conjectured that biologically realistic sensing by individual organisms may have greater fitness value than these measures [22,25].
In this work, we resolve this problem by proving generally that individual sensing always has greater fitness value than common sensing does. The additional fitness gain, which satisfies fluctuation relations (FRs), is attributed to the selection of organisms that obtain a correct sensing signal by chance.
A new quantity, which is similar to the coarse-grained entropy production in information thermodynamics, is introduced to quantify the total fitness gain from individual sensing, the upper bound of which is strictly higher than the directed information. We further show that the optimization of this quantity is closely related to optimizing an auto-encoding network, in which sensing, phenotypic switching, and metabolic allocation work as encoding, processing, and decoding, respectively. Our general results, especially those for the FRs, are verified by numerical simulations.

Modeling Sensing and Adaptation Processes
We consider a population of an asexual organism that replicates with an instantaneous replication rate k(x, y), depending on its phenotype x ∈ S_x and the state of the environment y ∈ S_y, where the phenotypic and environmental states are assumed to be discrete and finite for simplicity. The organism switches its phenotype stochastically from x to x′ by exploiting a sensing signal z ∈ S_z with a transition probability T_F(x′|x, z) within a small time interval ∆t. Depending on the physical entity of z, the sensing can be categorized as either individual or common sensing [22,25]. In the case of individual sensing, z is the state of a sensing system of the organism, such as the activity of receptors. Owing to the stochasticity of the sensing process, individual organisms receive different sensing signals z (Figure 1a). By assuming that the stochastic sensing output z′ depends on the state of the environment y′ as T_S(z′|z, y′), we describe the dynamics of the number of organisms N_t^{Y_t}(x_t, z_t) that have phenotypic state x_t with sensing signal z_t at time t by Equation (1), where Y_t := {y_0, · · · , y_t} is the history of the environmental state, the statistical properties of which are characterized by a path probability Q[Y_t]. In this representation, we implicitly assume a causal dependency among x_t, y_t, and z_t as · · · → y_t → z_t → x_t → y_{t+1} → z_{t+1} → x_{t+1} → · · ·. Such a representation has been conventionally used for notational simplicity.

Figure 1. The colors of cells and of the molecules on the cells represent phenotypic states and sensing signals, respectively. Bars on the diagrams indicate the histories of environmental states and common sensing. In (a), the sensing signal of each cell is correlated with the environmental state but has an intercellular variation due to the stochasticity of individual sensing. In (b), on the other hand, all the cells at a given time point share the same sensing signal, which is shown by the background colors in the diagram.
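As a minimal sketch of this update rule (with hypothetical rates and transition matrices, not taken from any specific biological system, and assuming memory-less sensing for simplicity), the dynamics of N_t^{Y_t}(x_t, z_t) can be iterated as follows:

```python
import numpy as np

# Hypothetical toy model: 2 phenotypes x, 2 signals z, 2 environmental states y.
k = np.array([[1.0, -0.5],
              [-0.5, 1.0]])            # k[x, y]: replication rate of phenotype x in env y
T_S = np.array([[0.8, 0.2],
                [0.2, 0.8]])           # T_S[z', y']: memory-less sensing P(z' | y')
# T_F[x', x, z']: switching from x to x' given the freshly sensed signal z'
T_F = np.empty((2, 2, 2))
T_F[:, :, 0] = [[0.9, 0.9],
                [0.1, 0.1]]            # signal z' = 0 drives cells toward x' = 0
T_F[:, :, 1] = [[0.1, 0.1],
                [0.9, 0.9]]            # signal z' = 1 drives cells toward x' = 1

def step(N, y_now, y_next, dt=1.0):
    """One update of N[x, z] under the causal order y -> z -> x:
    growth at rate k(x, y_now), then signal renewal z' ~ T_S(. | y_next)
    and phenotypic switching x' ~ T_F(. | x, z')."""
    grown = np.exp(k[:, y_now] * dt) * N.sum(axis=1)   # sum_z e^{k dt} N[x, z]
    # N'[x', z'] = sum_x T_F[x', x, z'] * T_S[z', y_next] * grown[x]
    return np.einsum('axc,x,c->ac', T_F, grown, T_S[:, y_next])

N = np.full((2, 2), 0.25)              # initial population, total size 1
for y_now, y_next in [(0, 0), (0, 1), (1, 1)]:
    N = step(N, y_now, y_next)
fitness = np.log(N.sum())              # log of total population growth
```

Iterating `step` along a sampled environmental history Y_t gives the population size N_t^{Y_t}, from which path-wise fitnesses can be computed.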
In contrast, in the case of common sensing, z is assumed to be partial information on the environmental state that is common to all organisms [26,27] (Figure 1b). An example is an extracellular chemical that correlates with the environmental state and can be sensed by the organisms with negligible error. The dynamics of the number of organisms N_t^{Y,Z}(x_t) with phenotypic state x_t at time t, under a realization of the environmental and common-signal histories Y_t and Z_t, can be represented by Equation (2). We assume that the history of the common signal Z_t := {z_0, · · · , z_t} follows a statistical law Q[Z_t‖Y_t], which is causally conditional on the environmental history. At this stage, the statistical property of the common signal is abstractly represented as Q[Z_t‖Y_t]. In the following, however, we assume that Q[Z_t‖Y_t] is identical to the path probability generated by the sensing law of individual sensing, T_S(z′|z, y′), to clarify the difference between individual and common sensing. Although common sensing is less biologically realistic, most previous works on the fitness value of information addressed only common sensing and proved that the fitness gain of common sensing is upper bounded by the directed information [26,27].
Before deriving relations between fitness gain and information measures, we mention the limitations of the model assumed above. In our modeling, we did not include the carrying capacity of the environment, which reduces the growth of individual cells as the number of cells in the environment approaches the capacity. We also assumed that cells cannot affect the behavior of the environment. Even though these factors are biologically important and theoretically intriguing, we did not include them, because the following information-theoretic analysis of the population dynamics would not be applicable to them at this moment. We touch on potential extensions of this work to include these factors in the Discussion.

Fitness of a Population with Individual and Common Sensing
The fitness of a population with individual sensing, Ψ_i[Y_t], and with common sensing, Ψ_c[Y_t, Z_t], can be defined from the corresponding total population sizes. We also define a pathwise historical fitness [28], K, and path probabilities for phenotypic and signal histories. Then, by using Equations (1) and (2), we can explicitly represent the fitnesses [26][27][28][29] as Equations (9) and (10), where ⟨·⟩_{P[X_t]} is the average with respect to P[X_t]. Here, P_F[X_t‖Z_t] is Kramer's causal conditioning, which indicates a causal relation between the conditioning and the conditioned histories [30,31]. Using the path representation of the fitnesses, we can define the time-backward retrospective path probabilities, P_B^i and P_B^c, where N_t^{Y}[X_t, Z_t] is the number of cells at time t that have phenotypic and individual-sensing histories, X_t and Z_t, under the realization of the environmental history Y_t. Similarly, N_t^{Y,Z}[X_t] is the number of cells at time t that have phenotypic history X_t under the realization of the environmental and common-sensing histories Y_t and Z_t. Thus, P_B^i and P_B^c can be interpreted as the probabilities of observing a phenotypic history X_t when we trace the phenotypic history from time t back to time 0 in a time-backward manner, retrospectively [26,27,29]. In contrast, P_F[X_t‖Z_t] is the probability of observing X_t when we trace the phenotypic history in a time-forward manner [26,27,29]. The difference between the two is attributed to the impact of selection, which can be characterized by investigating a population after selection, retrospectively.

Stochastic Trajectories of Individual and Common Sensing
In order to provide numerical examples of the difference between individual and common sensing, we consider a Markovian environment with three states, S_y = {s^y_1, s^y_2, s^y_3}, corresponding to a nutrient A-rich, a nutrient B-rich, and a nutrient-poor condition, respectively. The two phenotypic states, s^x_1 and s^x_2, are assumed to be adapted specifically to the nutrient A-rich state s^y_1 and the nutrient B-rich state s^y_2, respectively. The sensing signal has two states, S_z = {s^z_1, s^z_2}, which correspond to the nutrient A- and nutrient B-rich environments, s^y_1 and s^y_2, respectively. A cell, in the case of individual sensing, or cells, in the case of common sensing, receive s^z_1 or s^z_2 with high probability when the environmental state is s^y_1 or s^y_2, respectively. If the environment is in the nutrient-poor state s^y_3, a cell or cells obtain s^z_1 or s^z_2 with equal probability. Here, the sensing is assumed to be memory-less, T_S(z′|z, y′) = T_S(z′|y′), and, thus, its stochastic behavior is defined by a transition matrix, T_S(z|y), for individual sensing, and by T_FE(z|y) for common sensing (Figure 2c). In order to compare individual and common sensing, we set the accuracy of sensing to be equal, T_S(z|y) = T_FE(z|y), for all y ∈ S_y and z ∈ S_z. Finally, a cell is assumed to switch into phenotypic state s^x_i with high probability when it receives a sensing signal s^z_i for i ∈ {1, 2} (Figure 2d), where the phenotypic switching is set to be memory-less, T_F(x′|x, z) = T_F(x′|z). Given these conditions, Figure 3 illustrates the population dynamics of cells with individual sensing (Figure 3a,b) and with common sensing (Figure 3c,d) under two different realizations of the environment. For the first realization (Figure 3a,c,e), the population with individual sensing attains a higher fitness than that with common sensing (see the red and blue solid lines in Figure 3e), whereas, for the second realization (Figure 3b,d,f), the relation is reversed (Figure 3f). This clearly illustrates that the fitness advantages of individual and common sensing depend strongly on the actual realization of the environment and the common sensing signal.
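The setup above can be written down concretely; the numerical values below are hypothetical placeholders in the spirit of Figure 2, not the values used in the actual simulations:

```python
import numpy as np

rng = np.random.default_rng(1)

# Environment: s^y_1 (nutrient A-rich), s^y_2 (nutrient B-rich), s^y_3 (nutrient-poor).
T_E = np.array([[0.8, 0.1, 0.3],
                [0.1, 0.8, 0.3],
                [0.1, 0.1, 0.4]])      # T_E[y', y]: Markov transition, columns sum to 1
# Sensing T_S[z, y]: the signal matches the nutrient state with high probability
# and is uninformative (50/50) in the nutrient-poor state s^y_3.
T_S = np.array([[0.9, 0.1, 0.5],
                [0.1, 0.9, 0.5]])
# Switching T_F[x, z]: signal s^z_i drives cells into phenotype s^x_i.
T_F = np.array([[0.95, 0.05],
                [0.05, 0.95]])

def sample_environment(T, y0, steps):
    """Sample a Markovian environmental history Y_t = {y_0, ..., y_t}."""
    ys = [y0]
    for _ in range(steps):
        ys.append(int(rng.choice(3, p=T[:, ys[-1]])))
    return ys

Y = sample_environment(T_E, 0, 50)
```

Feeding such sampled histories Y into the population update of the previous section produces realization-dependent fitness trajectories like those in Figure 3.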
When common sensing produces a correct signal by chance, the population with common sensing can enjoy a higher fitness gain than that with individual sensing. By contrast, the population with common sensing loses fitness when the signal is incorrect. Figure 4 also shows that the fitnesses can fluctuate significantly, depending on the realizations. However, an ensemble average of the fitness shows that ⟨Ψ_i⟩_Q is greater than ⟨Ψ_c⟩_Q, at least for this specific instance (the red and blue solid lines in Figure 4a).

Value of Individual Sensing is Always Greater than that of Common Sensing
In order to characterize the fitness difference between individual and common sensing in general, we derive a detailed fluctuation relation, Equation (15), for the fitness difference g[Y_t, Z_t] := Ψ_i[Y_t] − Ψ_c[Y_t, Z_t] from Equations (9) and (10) (see also Appendix A.2 for the derivation). By assuming that the statistical property of common sensing is the same as that of individual sensing, Q[Z_t‖Y_t] = P_S[Z_t‖Y_t], as in Figures 3 and 4, we obtain the average fluctuation relation (FR), in which the gain G is the Kullback-Leibler (KL) divergence between the time-forward sensing behavior, P_S[Z_t‖Y_t], and the time-backward behavior, P_B^i[Z_t|Y_t] (see also Appendix A.3 for the derivation). Together with the non-negativity of the KL divergence, the average FR indicates that the average fitness of individual sensing is always greater than that of common sensing by G ≥ 0. As individual and common sensing are assumed to have the same statistical property, the source of the gain G is attributed to the individuality of the sensing. In the case of individual sensing, the organisms receiving the correct signal by chance grow more than those receiving an incorrect signal do. Thus, the retrospective signal histories P_B^i[Z_t|Y_t] are biased by selection away from the time-forward signal histories P_S[Z_t‖Y_t]. The gain G is exactly this bias, quantified by the KL divergence. No such gain is obtained from common sensing, because the sensing signal is common to all organisms and, thus, no bias is induced by selection. This result clearly indicates that the fitness value of individual sensing cannot be properly evaluated by considering only the time-forward behavior of the signal and the environment. Even though individual sensing gains more fitness than common sensing does on average, as demonstrated in Figure 4a, g[Y_t, Z_t] fluctuates significantly, and common sensing can gain more fitness than individual sensing does by chance (Figure 3b,d and Figure 5a).
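A single-step toy calculation (hypothetical numbers) illustrates where G comes from. With individual sensing, the population multiplies by the growth factor averaged over signals, so the surviving population over-represents cells that happened to sense profitably; the resulting gain over common sensing is exactly a KL divergence between the forward and the selection-biased (retrospective) signal distributions:

```python
import numpy as np

P_S = np.array([0.7, 0.3])            # forward distribution of the signal z given the env
k = np.log(np.array([2.0, 0.5]))      # growth rate of a cell acting on each signal
Z = np.sum(P_S * np.exp(k))           # per-step growth factor with individual sensing
psi_i = np.log(Z)                     # fitness with individual sensing
psi_c = np.sum(P_S * k)               # average fitness with common sensing
P_B = P_S * np.exp(k) / Z             # retrospective signal distribution after selection
G = np.sum(P_S * np.log(P_S / P_B))   # KL divergence between forward and retrospective
# psi_i - psi_c equals G, and G >= 0
```

The identity psi_i − psi_c = G holds exactly in this one-step sketch, mirroring the average FR in the text.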
From the detailed FR for g[Y_t, Z_t], Equation (15), we also derive the integral fluctuation relation (IFR), ⟨e^{−g[Y_t, Z_t]}⟩ = 1, which clarifies that g[Y_t, Z_t] fluctuates such that the positive g[Y_t, Z_t] balances the negative g[Y_t, Z_t] to satisfy the equality. The integral FR is also verified numerically in Figure 5a,b.
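In the same one-step toy model (hypothetical numbers as above), the pathwise difference g takes both signs, yet its exponential average over the forward law is exactly one:

```python
import numpy as np

P_S = np.array([0.7, 0.3])                  # forward signal distribution
growth = np.array([2.0, 0.5])               # growth factor e^{k} for each signal
P_B = P_S * growth / np.sum(P_S * growth)   # retrospective distribution after selection
g = np.log(P_S / P_B)                       # pathwise fitness difference Psi_i - Psi_c
ifr = np.sum(P_S * np.exp(-g))              # integral FR: <e^{-g}> = 1
```

Here g is negative for the profitable signal and positive for the unprofitable one, and the two contributions balance exactly in the exponential average.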

The Gain of Fitness by Individual Sensing
We further investigate Ψ_i[Y_t] to clarify how the fitness of organisms with individual sensing is shaped. To this end, as in a previous work [27], which investigated the fitness value of common sensing, we additionally assume that k(x, y) can be decomposed as e^{k(x,y)} = e^{k_max(y)} T_K(y|x) [27]. In this decomposition, k_max(y) can be interpreted as the maximum replication rate that would be attained if the organisms allocated all their metabolic resources to adapt only to the environmental state y. Under this extreme allocation, the organisms die out under environmental states other than y. The decomposition effectively means that each phenotypic state is characterized by how a cell allocates its metabolic resources to different environmental conditions. T_K(y|x) is the fraction of metabolic resources allocated to the environmental state y in a phenotypic state x. We therefore call T_K(y|x) the metabolic allocation strategy of the organisms in this work. The biological motivation behind the metabolic allocation is the problem of generalists and specialists. In order to adapt to a changing environment, a cell has essentially two tactics. One is to equip itself with a single fixed phenotypic state that distributes the metabolic resources over all the possible environmental states; such a cell can, at least, evade extinction under any environmental state. The other is to switch between multiple specialized phenotypic states, each of which allocates the metabolic resources to a small number of environmental states. In this case, each phenotypic state runs the risk of extinction, but such risk is hedged at the population level by stochastic switching of the phenotypic states. These two tactics are continuously interpolated, and the optimal one depends on the way the environment fluctuates.
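The decomposition can be made concrete with toy numbers (hypothetical, for illustration only): choosing a maximum rate k_max(y) per environment and an allocation matrix T_K(y|x) whose columns sum to one fixes the whole replication-rate table k(x, y):

```python
import numpy as np

k_max = np.array([1.0, 0.8])          # k_max[y]: maximal replication rate in env y
# T_K[y, x]: fraction of metabolic resources phenotype x allocates to env y.
# Columns sum to 1; a pure specialist would be a column like (1, 0).
T_K = np.array([[0.9, 0.2],
                [0.1, 0.8]])
k = (k_max[:, None] + np.log(T_K)).T  # k[x, y] from e^{k(x,y)} = e^{k_max(y)} T_K(y|x)
```

A pure specialist column (1, 0) would give k = −∞ in the unmatched environment, i.e., extinction there, matching the interpretation in the text.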
The introduction of the metabolic allocation strategy enables us to consider a wider spectrum of biological adaption in which the character of each phenotypic state is also optimized evolutionarily.
With this assumption, the historical fitness, Equation (4), is decomposed accordingly. By introducing this decomposition into Equation (9), we obtain Equation (21), which contains Ψ_0[Y_t], the average of which is known to bound the average fitness of a population without sensing [27] (see Appendix A.4 for the derivation). By marginalizing with respect to X_t and Z_t, we arrive at a decomposition of Ψ_i[Y_t] into Ψ_0[Y_t] and a remainder σ[Y_t]. Since the average of Ψ_0[Y_t] is the tight bound of the fitness without sensing, σ[Y_t] is the gain of fitness by individual sensing. Here, P_KFS[Y′_t|Y_t] is the probability that an organism allocates its metabolic resources to an environmental history Y′_t when it experiences the environmental history Y_t. Thus, P_KFS[Y_t|Y_t] measures the probability that the metabolic resources are correctly allocated to the actual environmental history Y_t, and 1 − P_KFS[Y_t|Y_t] is the probability of an incorrect allocation. In other words, P_KFS[Y_t|Y_t] characterizes how accurately the individual sensing, phenotypic switching, and metabolic allocation together respond to the actual environment. From an information-theoretic viewpoint, this cascade from environment to metabolic allocation via sensing and phenotypic switching is very similar to the auto-encoding and decoding of information Y_t via multiple layers [32]. The sensing works as the encoding of an environmental history Y_t into Z_t. The signal-dependent phenotypic switching is the processing of the encoded signal in the internal layers. The metabolic allocation is the decoding process that recovers the original information, Y_t, from X_t. Under this interpretation, P_KFS[Y′_t|Y_t] determines the statistical correspondence between the encoded information Y_t and the decoded information Y′_t, and P_KFS[Y_t|Y_t] is the probability that the encoded information Y_t is correctly decoded.
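For a single time step with memory-less kernels, the cascade behind P_KFS is simply the composition of the three channels (the matrices below are hypothetical): sensing encodes y into z, switching processes z into x, and metabolic allocation decodes x back into an environmental state y′:

```python
import numpy as np

T_S = np.array([[0.8, 0.2],
                [0.2, 0.8]])          # encode:  env y      -> signal z
T_F = np.array([[0.9, 0.1],
                [0.1, 0.9]])          # process: signal z   -> phenotype x
T_K = np.array([[0.85, 0.15],
                [0.15, 0.85]])        # decode:  phenotype x -> allocated env y'
P_KFS = T_K @ T_F @ T_S               # overall channel P_KFS[y', y] from y to y'
correct = np.diag(P_KFS)              # P_KFS[y | y]: probability of correct allocation
```

The diagonal entries of `P_KFS` play the role of the correct-allocation probability P_KFS[Y_t|Y_t] in this one-step sketch.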
Therefore, the total fidelity can be quantified by γ_t. Formally, quantities similar to σ[Y_t] and γ_t were introduced by Sagawa and Ueda as the coarse-grained entropy production and the efficiency parameter of feedback control in information thermodynamics [33,34].
The associated quantity P_γ[Y_t] is a path probability. By combining it with Equation (22), we obtain Equation (25); taking the average with respect to Q[Y_t] yields Equation (26). Equations (25) and (26) can be regarded as detailed and average FRs, respectively, with respect to σ[Y_t]. As ⟨Ψ_0⟩_Q is the tight upper bound of the average fitness without sensing, this relation means that γ_t is an upper bound of the fitness gain from individual sensing. Moreover, γ_t is an intrinsic quantity of the population, in the sense that it is determined irrespective of the actual statistical law of the environment, Q[Y_t]; the behaviors of these quantities are illustrated numerically in Figure 5c,d.

Connection with Other Information Measures
In order to link the quantities σ and γ_t with other common information measures, we further assume that the environment is Markovian, characterized by a transition probability T_E(y_{t+1}|y_t), and that the sensing is memory-less, T_S(z′|z, y′) = T_S(z′|y′). Then, we obtain the joint time-forward probability for Y_t and Z_t and its Bayesian causal decomposition, Equation (30) (see Appendix A.5 for the derivation). In this decomposition, the two factors are path probabilities generated by two pairs of new transition probabilities that are obtained by Bayes' theorem, Equations (33) and (34), where T_S^B(y_{t+1}|z_{t+1}, y_t) is the Bayesian posterior of the environmental state, y_{t+1}, given the information of the sensed signal z_{t+1} and the previous environmental state y_t. In this procedure, we switch the causal order of y_t and z_t by using Bayes' theorem. Then, by using Equations (15) and (21), the fitness can be rearranged as Equation (35), where i[Z_t → Y_t] is the pointwise directed information from Z_t to Y_t (see Appendix A.6 for the derivation). This is another detailed FR with individual sensing, the average version of which, Equation (36), is obtained by taking the average; I[Z_t → Y_t] := ⟨i[Z_t → Y_t]⟩ is the directed information [31]. Directed information is an extension of the mutual information of two trajectories that accounts for the causal relationship between them. Similarly to transfer entropy, the directed information quantifies the causal dependency between two trajectories; transfer entropy is related to the upper bound of the rate of the directed information [35]. The integral version is illustrated numerically in Figure 5e. From these relations, we can immediately see that Equations (35) and (36) are equivalent to the detailed and average FRs, respectively, for the fitness with common sensing, which were originally derived in [27]. For a given and fixed sensing property, T_S(z_τ|y_τ), the maximum gain of the average fitness by common sensing is shown to be bounded by the directed information, where the equality is attained when D_loss = 0.
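The Bayesian reordering of the causal structure can be sketched numerically (hypothetical two-state matrices): the joint kernel T_S(z′|y′) T_E(y′|y) is re-factorized into a marginal for the signal and a posterior for the environment given the signal:

```python
import numpy as np

T_E = np.array([[0.9, 0.2],
                [0.1, 0.8]])                # T_E[y', y]: environmental transition
T_S = np.array([[0.8, 0.3],
                [0.2, 0.7]])                # T_S[z', y']: memory-less sensing
joint = T_S[:, :, None] * T_E[None, :, :]   # joint[z', y', y] = T_S(z'|y') T_E(y'|y)
TB_marg = joint.sum(axis=1)                 # marginal of the signal z' given y
TB_post = joint / TB_marg[:, None, :]       # Bayesian posterior of y' given (z', y)
```

Here `TB_post` plays the role of T_S^B(y_{t+1}|z_{t+1}, y_t) in the text, and multiplying it back by `TB_marg` reproduces the original kernel, which is the re-factorization used in the decomposition.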
Here, D_loss is the loss of fitness due to an imperfect implementation of a sequential Bayesian inference, and it becomes 0 if and only if the phenotypic switching strategy, P*_F[X_t‖Z_t], and the metabolic allocation strategy, P*_K[Y_t‖X_t], are jointly optimized to implement the Bayesian sequential inference. An instance of the optimal metabolic allocation and phenotypic switching strategies is T*_K(y|x) = δ_{x,y} and T*_F(x′|x, z) = T_S^B(y′|z, y)|_{y′=x′, y=x}, when S_x = S_y. These optimal strategies mean that, for each phenotypic state, all metabolic resources are allocated to one of the environmental states, and that the phenotype switches with exactly the Bayesian posterior probability of the environment given the sensing signal. In other words, cells with phenotype x can survive and grow only when the environment is in the state to which all the metabolic resources are allocated under the phenotype x, and the cells change their phenotypic state by calculating the Bayesian posterior probability of the next environmental state given the common sensing signal.
In contrast, in the case of individual sensing, the Bayesian inference is no longer optimal, because G depends on the strategies of phenotypic switching and metabolic allocation, and {P*_F, P*_K} may not be the maximizer of G. This fact is shown more clearly by noting that ⟨Ψ*_i⟩ and G* are obtained by inserting the P*_F and P*_K that satisfy D_loss = 0. Equivalently, max⟨Ψ_i⟩_Q − ⟨Ψ_0⟩_Q ≥ I[Z_t → Y_t] + G*. This inequality indicates that the maximum average fitness gain from individual sensing for a fixed sensing strategy is greater than or equal to the directed information plus G*, which means that the sequential Bayesian inference is no longer optimal. It is optimal in the case of common sensing, because the sensing signal is common and the subsequent phenotypic diversification obtained by following the sequential Bayesian inference hedges the risk of error optimally. With individual sensing, in contrast, the stochastic individual sensing automatically induces a diversification in a population, which makes the subsequent diversification by following the Bayesian posterior suboptimal and redundant. Moreover, information measures of the sensing, such as the directed information, may not be appropriate quantities to capture the efficiency of the overall decision-making process with individual sensing.

Discussion and Future Works
These results indicate that σ[Y_t] and γ_t are more relevant quantities for characterizing the fitness gain from individual sensing. From the average FR of σ[Y_t], ⟨σ⟩_Q = γ_t − D[Q[Y_t]‖P_γ[Y_t]], the maximization of ⟨σ⟩_Q reduces to balancing the maximization of the total fidelity γ_t and the minimization of D[Q[Y_t]‖P_γ[Y_t]]. As both γ_t and P_γ[Y_t] depend on the actual strategies of the organisms, there exists a tradeoff between them, in general.
In the analogy of autoencoding and decoding, γ t becomes higher when each input Y t is decoded more correctly.
The divergence D[Q[Y_t]‖P_γ[Y_t]] is minimized when the relative fidelity for Y_t matches the probability, Q[Y_t], with which the environmental history Y_t appears, since P_γ[Y_t] measures the relative fidelity of decoding Y_t, given Y_t as the encoded information. From the definition of P_γ[Y_t], Equation (24), P_γ[Y_t] ≤ e^{−γ_t} must hold for each Y_t. If the total fidelity γ_t is fixed and small enough to satisfy max_{Y_t} Q[Y_t] ≤ e^{−γ_t}, balancing sensing, phenotypic switching, and metabolic allocation to satisfy P_γ[Y_t] = Q[Y_t] becomes the optimal strategy to maximize ⟨σ⟩_Q. This observation suggests that, under biologically realistic situations with moderate total fidelity, P_γ[Y_t] = Q[Y_t] can be regarded as a proxy of the optimal strategy with individual sensing. If the total fidelity is so high that max_{Y_t} Q[Y_t] ≤ e^{−γ_t} is violated, however, D[Q[Y_t]‖P_γ[Y_t]] = 0 cannot be achieved, and a more complicated optimization is required.
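A toy numerical check (hypothetical values) of this allocation rule: fixing the total fidelity γ_t and writing ⟨σ⟩_Q = γ_t − D[Q‖P_γ], the gain is maximized when the relative fidelity P_γ matches the environmental statistics Q, provided max Q[Y_t] ≤ e^{−γ_t}:

```python
import numpy as np

gamma = 0.6                           # total fidelity (fixed; hypothetical value)
Q = np.array([0.5, 0.3, 0.2])         # probabilities of environmental histories (toy)
assert Q.max() <= np.exp(-gamma)      # feasibility of the matching P_gamma = Q

def avg_sigma(P_gamma):
    """Average gain <sigma>_Q = gamma - D[Q || P_gamma]."""
    return gamma - np.sum(Q * np.log(Q / P_gamma))

best = avg_sigma(Q)                   # matching allocation: divergence term vanishes
uniform = avg_sigma(np.full(3, 1/3))  # a mismatched allocation does worse
```

The matching allocation attains ⟨σ⟩_Q = γ_t, while any mismatched P_γ pays the KL penalty.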
These investigations, in conjunction with the analogy between this problem and auto-encoding and decoding, show that, in order to understand the decision-making of cells and organisms with individual sensing, we should consider a joint optimization of sensing, phenotypic switching, and metabolic allocation, rather than an optimization of a part of them with the others fixed and given [36]. In the evolution of cellular and organismal decision-making, these three factors are concurrently subject to natural selection, and we have to frame the problem accordingly. This challenge may also lead to a deeper understanding of thermodynamics with feedback, because quantities similar to σ[Y_t] and γ_t have already appeared in the problem of feedback efficiency in information thermodynamics [33,34]. Moreover, the analogy with auto-encoding may pave the way to linking the fields of machine learning and deep learning with those of evolutionary biology and optimization.
Finally, we should note that all the information-theoretic relations derived in this work, as well as those in previous works, basically assume no cell interactions and no feedback from organisms to the environment. Even though extending the relations to relax these assumptions is a difficult problem, doing so could substantially expand the applicability of the information-theoretic approach to various biological problems.
Appendix A.2. Derivation of Equation (15)
Equation (15) is obtained by using Equations (9) and (10) in the first line of the derivation and by marginalizing the numerator and the denominator with respect to X_t in the second line.
Appendix A.3. Derivation of Equations (16) and (17)
Equations (16) and (17) follow by taking the averages of the detailed FR, Equation (15), over the corresponding path probabilities.

Appendix A.4. Derivation of Equation (21)
Starting from Equation (9) and using Equation (22) to obtain the last equality, Equation (21) follows by rearranging the first and the last terms.

Appendix A.5. Derivation of Equation (30)
Starting from Equations (28) and (29), Equation (30) is obtained by using Equations (33) and (34) in the third line and p(y_0|z_0) p(z_0) = p_S(z_0|y_0) p_E(y_0) from Bayes' theorem.
Appendix A.6. Derivation of Equation (35)
By marginalizing the numerator and the denominator of Equation (21) with respect to X_t and rearranging the resulting equality by using Equation (15) and the definition of i[Z_t → Y_t], we obtain Equation (35).