Entropy in Cell Biology: Information Thermodynamics of a Binary Code and Szilard Engine Chain Model of Signal Transduction

A model of signal transduction from the perspective of informational thermodynamics has been reported in recent studies, and several important achievements have been obtained. The first achievement is that signal transduction can be modelled as a binary code system, in which two forms of signalling molecules are utilised in individual steps. The second is that the average entropy production rate is consistent during the signal transduction cascade when the signal event number is maximised in the model. The third is that a Szilard engine can be a single-step model in the signal transduction. This article reviews these achievements and further introduces a new chain of Szilard engines as a biological reaction cascade (BRC) model. In conclusion, the presented model provides a way of computing the channel capacity of a BRC.


Introduction
Information science provides a theoretical framework for understanding cell biology. Various types of information entropy have been defined and applied in biological research. "Single cell entropy" was introduced for estimating specified gene and kinase protein expression networks [1,2]. Further, multicellular behaviour was analysed by a mathematical model in which individual cells interact with each other by secretion and sensing [3,4]. Immunological responses against various antigens were quantified using entropy defined by the selective probability of amino acid residues [5]. The genetic entropy defined by the DNA mutation rate is computable and useful in analysing molecular evolution [6], and correlation analysis of the frequency of mutated genes responsible for cancer pattern development provides useful predictive data for clinical prognosis. Further, the transfer entropy is generalised by the Kullback-Leibler divergence between two probabilistic transition statuses along a time course, and measuring the transfer entropy enables quantification of the information flow between stationary systems evolving in time [7,8]. Mutual entropy was defined on the basis of the correlation analysis between enzymes and metabolites such as ATP [9].
In addition to these recent developments, significant achievements have been reported through the application of information thermodynamics to cell systems that involve a feedback controller; such a system can be regarded as an integrative system in which information and thermodynamic entropy intersect [10][11][12][13][14][15]. Many studies of information-driven work have recently been presented. For example, an information-driven artificial molecular motor device consisting of an enzyme has been reported [11].
The upper limit of the average work $\langle w \rangle$ that can be extracted from a thermodynamic engine depends on the system temperature $T$, the Boltzmann constant $k_B$, the free energy change $\Delta F$ and the mutual entropy $H$ informed by the feedback controller [12][13][14][15]:

$$\langle w \rangle \le -\Delta F + k_B T H. \tag{1}$$

Inequality (1) implies that free energy and mutual entropy are exchangeable parameters. For the isovolumic and isothermal biological system, Inequality (1) can be simplified to

$$w \le k_B T H. \tag{2}$$

As shown below, the work extracted from an ideal Szilard engine is

$$w = k_B T H. \tag{3}$$

From the viewpoint of information thermodynamics, signal transduction in Escherichia coli has been reported [16]. Our earlier works considered the possibility that the mutual entropy may be utilised for exchanging signalling molecules along the biological reaction cascade (BRC) [17][18][19][20]. This review later introduces an ideal chain of Szilard engines constituting the BRC model [17].

A Common BRC Model
Let us consider modelling signal transduction by focusing on aspects that are common to several signal transduction pathways. In a BRC, the substrate protein in one reaction may become an enzyme or modulator in the next reaction. The most well-known example of a chain reaction is the chain of protein phosphorylation in the mitogen-activated protein kinase (MAPK) cascade [21][22][23][24][25][26], which is shown in (4):

$$\mathrm{EGF} \rightarrow \mathrm{EGFR} \rightarrow \mathrm{Ras} \rightarrow \text{c-Raf} \rightarrow \mathrm{MEK} \rightarrow \mathrm{ERK}. \tag{4}$$

In this BRC, the epidermal growth factor receptor (EGFR), Ras (a type of GTPase), the proto-oncogene c-Raf, MAP kinase-extracellular signal-regulated kinase (MEK) and extracellular signal-regulated kinase (ERK) follow the stimulation with the epidermal growth factor (EGF). Phosphatases were omitted in the above equation. This MAPK cascade is a ubiquitous signalling pathway in various cell types, which allows growth and proliferation. The EGFR mutation promotes the enhancement of this cascade, which contributes to the tumourigenesis of lung and other cancers [27].
To understand the essence of complicated cell signalling, the BRC model consisting of $j$th and reverse $-j$th steps can be constructed ($1 \le j \le n$): Each step represents an activation of the signalling molecules $X_j$ in the cytoplasm maintained by a chemical reservoir of mediator $A$, such as adenosine triphosphate (ATP), and an inactivation of the signalling molecules $X_j^*$ by enzymes $Ph_j$. ATP is hydrolysed into adenosine diphosphate (ADP; $D$ in (5)) and inorganic phosphate (Pi), which modifies the amino acid residue of $X_j$. $X_j$ and $X_j^*$ denote unmodified (inactive) and modified (active) signalling molecules, respectively. In the first reaction, the ligand ($L$), an extracellular molecule (EGF in the MAPK cascade), stimulates $X_1$, which represents a receptor ($R$) on the cellular membrane (EGFR in the MAPK cascade). Afterward, the $X_1$-$L^*$ complex promotes the modification of $X_2$ in the cytoplasm into $X_2^*$, activated by Pi that originated from $A$, and $D$ is produced. Further, $X_2^*$ promotes the modification of $X_3$ into $X_3^*$. In this manner, the $j$th signalling molecule, $X_j^*$, activates $X_{j+1}$ in the cytoplasm into $X_{j+1}^*$. Following the $(n-1)$th step, the signalling molecule $X_n^*$ binds to the promoter region of the DNA and induces mRNA transcription in the $n$th step. In addition, during the reverse BRC steps, the inactivation of $X_j^*$ into $X_j$ occurs through enzymes that catalyse inactivation or through self-inactivation by $X_j^*$, in which Pi is released. Thus, the $j$th and $-j$th steps form a cycle reaction consisting of activation and inactivation. Finally, the pre-stimulation steady state of the individual step is recovered. Such reaction chain schemes were previously described by Gaspard et al. and Tsuruyama [17,18][28][29][30].

Binary Code Model of BRCs
Recent studies showed that the BRC can be interpreted as a binary code system with two forms of signalling molecules, namely an active form ($X_j^*$) and an inactive form ($X_j$), in each step [18][19][20]. The total signal event number $\Psi$ in a given BRC event can be described using the concentrations of the inactive molecules $X_j$ and the active molecules $X_j^*$ as follows:

$$\Psi = \frac{X!}{\prod_{j=1}^{n} X_j!\, X_j^*!}. \tag{6}$$

$X$ represents the total concentration of signalling molecules. The logarithm of $\Psi$ is approximated according to Stirling's approximation and gives Shannon's entropy $S$ using the selection probabilities of $X_j$ and $X_j^*$, $p_j = X_j/X$ and $p_j^* = X_j^*/X$ [19]:

$$S = \log \Psi \approx -X \sum_{j=1}^{n} \left( p_j \log p_j + p_j^* \log p_j^* \right). \tag{7}$$

Selecting the $j$th step component of $S$ in (7) gives:

$$s_j = -X \left( p_j \log p_j + p_j^* \log p_j^* \right). \tag{8}$$

When the signal is transmitted to the $j$th step, the concentrations of $X_j$ and $X_j^*$ fluctuate, and $s_j$ is given using the probability fluctuations $dp_j$ and $dp_j^*$:

$$s_j = -X \left[ \left( p_j + dp_j \right) \log \left( p_j + dp_j \right) + \left( p_j^* + dp_j^* \right) \log \left( p_j^* + dp_j^* \right) \right]. \tag{9}$$

Because the signal has not yet reached the $(j+1)$th step, the entropy of the $(j+1)$th step remains:

$$s_{j+1} = -X \left( p_j \log p_j + p_j^* \log p_j^* \right). \tag{10}$$

Therefore, the entropy difference $H_j$ generated between the $j$th and $(j+1)$th steps is presented as follows [17][18][19]: In addition, the entropy difference per single active molecule, $h_j$, is given by Equation (11): In a previous report [19], the entropy current $C_j$ was introduced as follows: Accordingly, the entropy current density $c_j$ per single active molecule is given as:
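The step from the signal event number to Shannon's entropy can be checked numerically. The following is a minimal sketch, assuming that the concentrations are replaced by integer molecule counts so that the factorials are well defined; the counts and the function names `log_event_number` and `shannon_estimate` are hypothetical and chosen only for illustration. It compares the exact logarithm of the multinomial event number with the entropy expression obtained from Stirling's approximation.

```python
import math


def log_event_number(counts):
    """Exact natural log of X! / prod(n_i!), evaluated with log-gamma."""
    total = sum(counts)
    return math.lgamma(total + 1) - sum(math.lgamma(n + 1) for n in counts)


def shannon_estimate(counts):
    """Stirling-based estimate -X * sum(p_i * log p_i), in nats."""
    total = sum(counts)
    return -total * sum(
        (n / total) * math.log(n / total) for n in counts if n > 0
    )


# Hypothetical molecule counts (X_1, X_1*, X_2, X_2*); illustrative only.
counts = [4000, 1000, 3000, 2000]

exact = log_event_number(counts)
approx = shannon_estimate(counts)
rel_error = abs(exact - approx) / exact

print(f"log(Psi) exact     = {exact:.1f} nats")
print(f"Shannon estimate   = {approx:.1f} nats")
print(f"relative deviation = {rel_error:.2e}")
```

For counts of this size the two values should agree closely, and the agreement improves as the counts grow, which is the regime in which the entropy form is used in the text.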

Mutual Entropy in BRCs
For the evaluation of mutual entropy in Equation (15) according to information theory, let us consider the channel capacity of the $j$th cycle ($1 \le j \le n$) (Figure 1). The natural logarithm was applied in place of the base-2 logarithm to simplify the description. The entropy $h_j^0$ is given using $q_j = p_j/(p_j + p_j^*)$ and $q_j^* = p_j^*/(p_j + p_j^*)$, as follows:

$$h_j^0 = -q_j \log q_j - q_j^* \log q_j^*. \tag{18}$$

The conditional entropy $h_j(j+1|j)$ of the $(j+1)$th step for the given $j$th step can be described as a linear function of $q_j^*$ using the noise occurrence probability $\varphi_j$: $h_j^0$ and $h(j+1|j)$ are chosen in such a manner as to maximise the mutual entropy, which is defined by , subject to the constraint $q_j^* + q_j = 1$. The channel capacity is defined as the maximum value of the mutual entropy:

Figure 1. Schematic of the relationship between the $j$th step and the $(j+1)$th step (left) and between the $-j$th step and the $(-j-1)$th step (right) of a simple discrete channel. The left graph shows a signal transduction, and its channel capacity is expressed by $C_j$. The right graph shows the reverse signal transduction, and its channel capacity is expressed by $C_{-j}$. In the reverse signal transduction, from the $-j$th step to the $(-j-1)$th step, $q_j$ transmits the signal to $q_{j-1}$, but it may transmit the signal to $q_{j-1}^*$ in error.
To obtain the maximised mutual entropy, the following function $U_j$ with the undetermined parameter $\lambda$ is maximised using Lagrange's method of undetermined multipliers as follows [13]: Setting the right-hand sides of Equations (23) and (24) to zero, and eliminating $\lambda$, we have: From $q_j + q_j^* = 1$ and (25), the following can be derived: As a result, the channel capacity of the $j$th step, $C_j$, is given using (21) and (26)-(28) as the maximum value of the mutual entropy: The mutual entropy of the reverse signal transduction is likewise given using the entropy $h_{-j}^0 = h_j^0 = -q_j \log q_j - q_j^* \log q_j^*$ (Figure 1). The following function $U_{-j}$ for the reverse transduction, with the undetermined parameter $\lambda$, is maximised as follows: In the above, $\varphi_{-j}$ denotes the noise occurrence probability in the reverse cascade. Then: Setting the right-hand sides of Equations (31) and (32) to zero, and eliminating $\lambda$, we have: Accordingly, the channel capacity $C_{-j}$ is given by: The channel capacity of the $j$th cycle step is defined and calculated as follows: Thus, we can obtain the mutual entropy as the entropy difference in Equation (15).
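The maximisation that defines the channel capacity can also be illustrated numerically. The sketch below is not the Lagrange-multiplier derivation used in the text: it swaps in a plain grid search over the input probability $q_j^*$, and it assumes a generic binary channel described by a 2 x 2 matrix of transition probabilities. The symmetric channel with noise probability `phi`, and all function names, are hypothetical choices made for illustration. Natural logarithms are used, as in the text.

```python
import math


def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def mutual_information(q_active, channel):
    """Mutual entropy (nats) of a binary channel.

    q_active is the input probability of the active form; channel[i][k] is
    the probability of output k given input i (0 = inactive, 1 = active).
    """
    inputs = [1.0 - q_active, q_active]
    outputs = [
        sum(inputs[i] * channel[i][k] for i in range(2)) for k in range(2)
    ]
    conditional = sum(inputs[i] * entropy(channel[i]) for i in range(2))
    return entropy(outputs) - conditional


def channel_capacity(channel, grid=10001):
    """Grid search for the input probability that maximises mutual entropy."""
    best_value, best_q = 0.0, 0.0
    for step in range(grid):
        q = step / (grid - 1)
        value = mutual_information(q, channel)
        if value > best_value:
            best_value, best_q = value, q
    return best_value, best_q


# Hypothetical noise occurrence probability; a symmetric channel is assumed.
phi = 0.1
symmetric = [[1 - phi, phi], [phi, 1 - phi]]
capacity, q_opt = channel_capacity(symmetric)

print(f"capacity = {capacity:.4f} nats at q* = {q_opt:.3f}")
```

For the symmetric case the search should recover the known closed form, $\log 2$ minus the binary entropy of the noise probability, at $q_j^* = 1/2$. An asymmetric matrix can be passed in the same way to mimic a channel in which only one of the two forms is transmitted in error.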

Szilard Engine Chain as a BRC Model
Subsequently, let us consider that phosphorylation and dephosphorylation reactions form a cycle reaction that simultaneously activates the next cycle reaction in a BRC. The cycle reaction of an individual step can be modelled as a Szilard engine, which may serve as a model of the conversion system [17]. The Szilard engine was proposed by Leo Szilard in considering Maxwell's demon paradox [31,32]. In the engine model, Maxwell's demon, which is a feedback controller, utilises the position information of a single gas particle in a box that is in contact with a heat bath. In the initial state, a boundary is inserted at the middle position of the room such that the controller can determine whether the single gas particle is in the left space or in the right space of the room. The information gained by the controller is equal to one bit (i.e., left or right). If the particle is in the left space, the boundary is quasi-statically moved to the right until the full volume of the room is recovered; if the particle is in the right space, the boundary is moved to the left in the same manner. In both cases, the particle isothermally expands with the movement of the boundary back to the original full volume. The extracted work is equal to $k_B T \ln 2$. This process is equivalent to a system in which the feedback controller transforms the gained information into actual expansion work. Such a feedback controller system has been realised experimentally [33,34]. In what follows, let us consider that the feedback controller is informed whether the signalling molecule is of the active or inactive type, in place of measuring the particle position.
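The value $k_B T \ln 2$ follows from integrating the single-particle pressure $k_B T / V$ over the isothermal expansion from half the volume to the full volume. The following is a minimal numerical sketch of that integral; the temperature of 310 K and the function name `isothermal_expansion_work` are assumptions made for illustration and do not come from the original text.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def isothermal_expansion_work(temperature, v_start, v_end, steps=100_000):
    """Midpoint-rule integral of p dV with p = k_B * T / V (one particle)."""
    dv = (v_end - v_start) / steps
    work = 0.0
    for i in range(steps):
        v_mid = v_start + (i + 0.5) * dv
        work += K_B * temperature / v_mid * dv
    return work


# Assumed temperature (roughly physiological); volumes in arbitrary units,
# since only the ratio of final to initial volume matters.
T = 310.0
numeric = isothermal_expansion_work(T, 0.5, 1.0)
analytic = K_B * T * math.log(2)

print(f"numerical work = {numeric:.4e} J")
print(f"k_B T ln 2     = {analytic:.4e} J")
```

At this temperature both values are of the order of $3 \times 10^{-21}$ J per cycle, which gives a sense of the energy scale of one bit of measured information.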
As reported previously [17,19], the BRC can be divided for modelling into $n$ hypothetical compartment fields corresponding to the individual $j$th steps ($1 \le j \le n$), each of which corresponds to a single Szilard engine. The diffusion rate of the signalling molecules is sufficiently low because of their high molecular weight, and they are hypothesised to be localised in the compartment fields. Each field contains all $X_{j+1}^*$ and $X_{j+1}$ species ($1 \le j \le n-1$), with concentrations identical to those of $X_{j+1}^{*\,st}$ and $X_{j+1}^{st}$, respectively, at the steady state. The feedback controller has the potential to recognise the molecule concentration. Subsequently, the controller selects $X_{j+1}^*$ or $X_{j+1}$ for transfer (Figure 2). The steps are summarised as follows when the BRC proceeds:
(i) When the signal transduction initiates, the controller measures the changes in the concentrations of the active molecule $X_{j+1}^*$ and the inactive molecule $X_{j+1}$ in the $j$th field.
(ii) At the $j$th step in the signalling cascade, the feedback controller introduces $\Delta X_{j+1}^*$ of $X_{j+1}^*$ to the $(j+1)$th field from the $j$th field by opening the forward gate on the boundary between the $j$th field and the $(j+1)$th field. Simultaneously, the controller introduces $\Delta X_{j+1}$ of $X_{j+1}$ to the $(j+1)$th field from the $j$th field by opening the back gate on the boundary.
(iii) Subsequently, $X_{j+1}^*$ can flow back with the forward transfer of $X_{j+1}$ from the $(j+1)$th field to the $j$th field because of the entropy difference (see Equation (13)). $X_{j+1}$ can also flow back with the backward transfer of $X_{j+1}$ from the $(j+1)$th field to the $j$th field because of the concentration gradient.
(iv) In (iii), $\Delta X_{j+1}^*$ and $\Delta X_{j+1}$ can quasi-statically rotate the exchange machinery on the hypothetical partition between the $j$th and $(j+1)$th fields, which has the ability to extract chemical work equivalent to $w_{j+1} = k_B T h_{j+1}$.
(v) As the next step, $w_{j+1}$ is linked to the modification of $X_{j+2}$ into $X_{j+2}^*$, which further causes the concentration difference of $X_{j+2}^*$ introduced by the feedback controller from the $(j+1)$th field to the $(j+2)$th field. The next step proceeds as described in (ii) to (iii).
Accordingly, replacing the suffix $j+1$ by $j$ for simplification, the chemical work $w_j$ extracted from the $j$th Szilard engine is given using the mutual entropy $h_j$, which informs the controller whether the signalling molecules increase or decrease, according to Equations (15) and (35) [12][13][14,18,19]:

$$w_j = k_B T h_j. \tag{36}$$
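The relation between the mutual entropy of each step and the extracted chemical work can be sketched for a whole chain. The example below assumes the per-step relation $w_j = k_B T h_j$ stated in step (iv), with $h_j$ measured in nats; the numerical values of $h_j$, the temperature, and the function names are hypothetical and serve only as an illustration.

```python
K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def step_work(h_j, temperature):
    """Chemical work of one engine, w_j = k_B * T * h_j, with h_j in nats."""
    return K_B * temperature * h_j


def chain_work(h_values, temperature):
    """Per-step work and total work for a chain of Szilard engines."""
    per_step = [step_work(h, temperature) for h in h_values]
    return per_step, sum(per_step)


# Hypothetical mutual entropies (nats) for a three-step cascade.
h_values = [0.69, 0.35, 0.20]
per_step, total = chain_work(h_values, temperature=310.0)

for j, w in enumerate(per_step, start=1):
    print(f"step {j}: w = {w:.3e} J")
print(f"total: {total:.3e} J")
```

Because the work is linear in $h_j$, the total over the chain depends only on the sum of the step-wise mutual entropies, so a cascade with more steps or higher per-step mutual entropy extracts proportionally more work under this assumption.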

Conservation of the Average Entropy Production
Next, let us review the optimised coding scheme that maximises the signal event number for a given duration in this binary coding model of signal transduction in a nonequilibrium steady-state system. First, the duration of signal transduction is defined in consideration of the signal orientation (i.e., forward $\tau_j$ and backward $\tau_{-j}$). Positive and negative values are assigned to $\tau_j$ and $\tau_{-j}$, respectively, to distinguish the signal direction. $\tau_j$ represents the duration of the transient increase in the active molecule $X_j^*$, whilst $\tau_{-j}$ represents the duration of the recovery to the initial state. The step cycle duration is represented by $\tau_j - \tau_{-j}$.
The average entropy productions $\zeta_j$ and $\zeta_{-j}$ during the signal transduction are defined over $\tau_j - \tau_{-j}$, and the average entropy production rate (AEPR) is defined using a bracket $\langle\ \rangle$ as: where $s_j$ is an arbitrary parameter representing the progression of a reaction event [35]. The transitional probability $p(j+1|j)$ is the probability of the $(j+1)$th step given the $j$th step during $\tau_j$, and $p(j|j+1)$ is the transitional probability of the $j$th step given the $(j+1)$th step during $\tau_{-j}$. The AEPR $\langle \zeta_j \rangle$ during the signal transduction from the $j$th to the $(j+1)$th field is given according to the fluctuation theorem (FT) at the steady state: The AEPR $\langle \zeta_{-j} \rangle$ from the $(j+1)$th to the $j$th field is given: The following equation is given using the signal current density $c_j$ in (17) [19,35]: Substituting the right side of Equation (17) into the right side of (41), an important result is given [19]: When the signal event number is maximised, the logarithm of the selection probability is described simply using the average entropy production rate $\beta$, independent of the step number, according to previous reports [17][18][19][20]: This is one type of entropy coding. Substitution of the right sides of Equations (43) and (44) into (42) gives: In (45) and (46), we used $\tau_j \ll \tau_{-j}$, as shown in Figure 3, and a sufficiently long duration of the whole signal transduction, according to experimental data [23,36,37]. The dephosphorylation of the signalling molecule $X_j^*$ takes a significantly longer time, $\tau_{-j}$. Subsequently, Equations (39), (40), (45) and (46) provide: In summary, we obtained the following result from Equations (43)-(47): Equations (48) and (49) imply the integration of information entropy, code length and thermodynamic AEPR. In these equations, the step numbers $j$ and $-j$ in $\zeta_j$ and $\zeta_{-j}$ were omitted because $\zeta_j$ and $\zeta_{-j}$ are independent of the step number. Thus, the theoretical basis of the consistency of the average entropy production rate can be obtained.
The average chemical work $\langle w_j \rangle$ in Equation (36), extracted through steps (i) to (iv) of the section "Szilard Engine Chain as a BRC Model", is calculated as follows using Equations (15), (36), (48) and (49) [18]: The summation of the right side of (50) gives the total work: Here, $\sigma_j$ stands for the entropy production during $\tau_j - \tau_{-j}$ at the $j$th step.

Figure 3. Time course of the concentration of the active signalling molecule $X_j^*$ during a signal event [36,37]. The vertical axis represents the concentration of $X_j^*$. The horizontal axis denotes the duration (min or time unit). $\tau_j$ and $\tau_{-j}$ denote the duration of the $j$th step and the reverse $-j$th step, respectively. The line $X_j^* = X_j^{*\,st}$ denotes the $X_j^*$ concentration at the initial steady state before the beginning of the signal event.

Conclusions
Signal transduction is an important research topic in life science, but quantitatively evaluating signalling data remains difficult. This review points out to life scientists the possibility of quantifying signalling. The review can be summarised in the following points: (i) The BRC can be expressed as a kind of binary code system consisting of two types of signalling molecules: activated and inactivated. (ii) Each reaction step of the BRC can be regarded as one engine in a chain of Szilard engines, in which the activation and inactivation of signalling molecules are repeated. (iii) The average entropy production rate is consistent during the BRC. (iv) The signal transduction amount can be calculated through the BRC.
The chain of Szilard engines is a useful model to show how signal transduction in one step induces signal transduction in the next step, by which a series of chains is formed. The most important point of this model is that it directly gives the signal transduction amount through the exchange work according to Equation (3). The chain introduced here illustrates that the feedback controller transfers signalling molecules based on the measurement of the increase and decrease of the signalling molecules. Subsequently, the exchanger molecule on the boundary between the steps can extract work because of the entropy gradient formed by the two types of signalling molecules. In this way, the signal transduction amount can be clearly quantified by the combination of chemical work.
Herein, let us consider the calculation of the entropy production based on the kinetics of the activation of signalling molecules according to (5). The signalling system is in contact with a chemical bath outside the system that provides ATP. The transitional rate from the $j$th step to the $(j+1)$th step, $v_j$, is obtained using the kinetic coefficient $k_j$ for the $j$th step as follows: The transitional rate from the $(j+1)$th step to the $j$th step, $v_{-j}$, which is equal to the rate of demodification (dephosphorylation) in the backward signal transduction, is given using the kinetic coefficient $k_{-j}$ for the $-j$th step: where $k_j$ and $k_{-j}$ represent the kinetic coefficients. The signal transduction system remains at a detailed balance around the steady state, the homeostatic point: Combining Equations (47) and (54)-(56), we obtain the following from FT: The above result contains Equation (42). Therefore, for a sufficiently long duration $\tau_j - \tau_{-j}$: Using the concentration of the active signalling molecules at the steady state, $X_{j+1}^{*\,st}$, we have: Substitution of Equation (59) into Equation (58) produces: Here, the entropy production $\sigma_j$ in (53) is defined in the $j$th step: In Equation (61), the fluctuation of $X_j^*$ is negligible during signal transduction relative to $\Delta X_{j+1}$ and $\Delta X_{j+1}^*$ according to experimental data [36]. The sum of the concentrations of $X_{j+1}$ and $X_{j+1}^*$ is equal to the total concentration $X_{j+1}^0$, which is kept constant because the signal transduction rate is significantly greater than the production rate of the signalling proteins. Then: Equations (54), (55) and (62) give the concentrations at the steady state: $Ph_{j+1}$ signifies the phosphatase concentration in the $j$th step. The fluctuation of the transmitted information is described as follows using an integral form of Equation (61): where $A_{ji}$ and $A_{jf}$ signify the local concentrations of the mediator ATP at the initial and final states, respectively, at the $j$th step.
$\Delta A_j$ signifies the concentration change of ATP at the $j$th step. Thus, the total entropy production $\sigma$ is simply given as follows: In the above, we again used the approximation $\log(1 + x) \approx x$ and set $A_{1i}$ equal to the initial concentration of ATP, $A$. Thus, ATP is the mediator of signal transduction. In an actual experiment, rigorously measuring the concentration change of ATP at individual signal steps is difficult because ATP is consumed in a variety of reactions as a basic metabolite for cell activity. Alternatively, because the ratio $\Delta X_{j+1}/\Delta X_{j+1}^{st}$ is negligible during signal transduction according to experimental data [36], we have from (61): As mentioned above, the right side of Equation (71) indicates that the AEPR $\langle \zeta \rangle$ is consistent during the cascade. Accordingly, the measurement of the AEPR will provide evidence of its consistency during the signalling cascade. In this manner, the rigorous measurement of the concentration change of the active signalling molecule may provide more direct evidence for the presented theory. To date, experimental data have demonstrated that the increase in active signalling molecules follows a similar time course at each step, as shown in Figure 3, suggesting the consistency of the AEPR [36,37].
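The balance between modification and demodification at the steady state, described above, can be illustrated with a small kinetic sketch. The rate laws used here are assumptions, not the review's own rate equations: the forward rate is taken as mass-action in ATP, the upstream active molecule and the inactive substrate, and the backward rate as mass-action in the phosphatase and the active molecule, with the total concentration of the substrate held constant. All parameter values and function names are hypothetical.

```python
def steady_state_active(k_fwd, k_rev, atp, upstream_active, phosphatase, total):
    """Active concentration at which the assumed forward and backward rates balance."""
    activation = k_fwd * atp * upstream_active  # effective activation rate
    inactivation = k_rev * phosphatase          # effective inactivation rate
    return total * activation / (activation + inactivation)


def relax_to_steady_state(k_fwd, k_rev, atp, upstream_active, phosphatase,
                          total, active0=0.0, dt=1e-3, steps=20_000):
    """Forward-Euler integration of d(active)/dt = v_forward - v_backward."""
    active = active0
    for _ in range(steps):
        inactive = total - active               # conservation of the total
        v_forward = k_fwd * atp * upstream_active * inactive
        v_backward = k_rev * phosphatase * active
        active += dt * (v_forward - v_backward)
    return active


# Hypothetical parameters in arbitrary, mutually consistent units.
params = dict(k_fwd=1.0, k_rev=1.0, atp=2.0, upstream_active=0.5,
              phosphatase=1.0, total=1.0)

analytic = steady_state_active(**params)
simulated = relax_to_steady_state(**params)

print(f"analytic steady state  = {analytic:.6f}")
print(f"simulated steady state = {simulated:.6f}")
```

Under these assumptions the active fraction at the steady state depends only on the ratio of the effective activation and inactivation rates, so raising the ATP or upstream active concentration shifts the balance towards the active form, while raising the phosphatase concentration shifts it back.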
Further study is required to determine which signal transduction strategy a biological system selects. For example, the cell system may select a strategy that maximises the signal event number during a given duration by using a non-redundant signal system; in contrast, the accuracy of the signal transduction may be prioritised by using a redundant signal system. The strategy chosen for the signal cascade by the cell system will likely be determined experimentally. The cost-performance tradeoffs of metabolites for cellular regulatory functions and information processing can be discussed by evaluating recent experimental data [23][24][25][26]. By measuring the consumption of a metabolite, Luo et al. were successful in estimating biological information [38]. The relationship between the ATP concentration in cellular tissues and information transmission has been vigorously studied in the analysis of nerve excitation transmission [39], and this review may have implications for the quantification of information transmission.
The discussion developed herein has some limitations, which we mention here at the end. A detailed balance between modification and demodification is assumed at each step for the application of FT. Therefore, the current discussion is valid only when the distance from the detailed balance is not great [17][18][19][20]. This also depends on how we consider the range of application of FT or the Jarzynski equality. FT has been applied to study non-equilibrium systems [28,29,40,41], limit cycles [42], molecular machines [43] and biological phenomena [44], including membrane transport [45], molecular motor activity [46] and RNA folding [47]. The adaptation and extension of the current discussion to non-linear phenomena [48,49] and to systems far from the steady state or active matter will be the next theoretical subjects. However, at the least, interpreting the signal cycle as a Szilard engine is an effective idea for thought experiments, and a chain of such engines may serve as a model of an actual BRC.
In conclusion, the information thermodynamics approach described herein provides a framework for the analysis of signal transduction in a BRC. This theoretical approach appears suitable for the identification of novel active signalling cascades among response cascades in which the AEPR is consistent through the given cascade. This review shows that the binary coding system and the Szilard engine chain model may be the theoretical basis for computing the channel capacity of a BRC.