A Statistical Thermodynamical Interpretation of Metabolism

The metabolic network of a cell can be decomposed into discrete elementary modes that contribute, each with a certain probability, to the overall flux through the metabolism. These modes are cell function supporting, fundamental pathways that represent permissible 'quantum' states of the metabolism. For the case that cellular regulatory mechanisms for pathway fluxes evolved in an unbiased way, we demonstrate theoretically that the usage probabilities of individual elementary modes are distributed according to Boltzmann's distribution law such that the rate of entropy production is maximized. Such distribution can be observed experimentally in highly evolved metabolic networks. Therefore, cell function has a natural tendency to operate at a maximum rate of entropy generation using preferentially efficient pathways with small reaction entropies. Ultimately, evolution of metabolic networks appears to be driven by forces that can be quantified by the distance of the current metabolic state from the state of maximum entropy generation that represents the unbiased, most probable selection of fundamental pathway choices.


Introduction
Elementary modes represent unique, minimal sets of reactions enabling the operation of specific pathways that are embedded in a metabolic network [1][2][3][4][5][6].The reactions produce or consume external OPEN ACCESS metabolites that accumulate in the system, and they link internal metabolites that are at steady state with constant concentrations in the cells.The number and the type of elementary modes can be rigorously determined with recently developed computational tools [7][8][9][10] providing a powerful means to quantitatively evaluate the complexity, robustness, and capabilities of a metabolic network [11][12][13].
The overall operation of the metabolism has to be viewed as a weighted average of all possible elementary modes that individually contribute to the metabolic flux in the cell [14].The identification of these contributions is a challenging problem that has been previously addressed with different approaches [15,16].We have previously made the experimental observation that the individual contributions of elementary modes to the overall metabolism appear to be correlated with the standard reaction entropy of individual elementary modes as defined by their overall reaction stoichiometry.Using concepts from statistical mechanics and statistical thermodynamics we develop here a theory that shows that such correlation is in fact expected.It is shown that the usage probability of individual elementary modes is distributed according to the Boltzmann distribution law such that the rate of entropy production is maximized.Moreover, the theory predicts the path that evolution takes when systems are not at the fully evolved, maximum entropy generating state.

Results and Discussion
An elementary mode can be formally represented by a vector whose elements define all the reaction rates (fluxes) of the metabolic network.The non-zero fluxes in such a vector define the pathway that represents a specific elementary mode.If cells use only glucose as the carbon and energy source, all pathways start with the uptake of glucose.The elements of the elementary mode vector are then conveniently normalized to the uptake rate of glucose.The metabolic flux through the network can be viewed as a weighted average of all elementary modes.Thus, the knowledge of all possible elementary modes provides a useful tool for evaluating metabolism in quantitative terms since the rate of individual reactions is the weighted average of all elementary modes, expressed as: where the index j refers to the jth reaction step in the network and the index i to the i th elementary mode.R j is then the j th reaction rate of the metabolic network, the reaction rates r i,j are defined by the n elementary modes, and w i is the fractional contribution of the i th elementary mode to the reaction j.Due to the normalization step, the unit of R j and of r i,j is the glucose uptake rate.The reactions leading to and from external metabolites are normally the transport reactions through the cell envelope.They define the overall reaction equation, and the ratios between their rates and the glucose uptake rate represent the yields or stoichiometry coefficients in the overall reaction equation.The utilization of individual elementary modes is subject to cellular regulation that must coordinate the expression and metabolic regulation of enzymes to support the operation of individual elementary modes at specific rates.This poses the challenging question whether the utilization of specific reaction sequences, as defined by elementary modes, follow certain principles and laws that govern the magnitude of the weighting factors that quantify the flux through specific elementary modes.The identification of some law that is the basis for these quantities would provide an understanding of the principles that are responsible for the behavior of a cell and ultimately explain what drives evolution.
The overall growth equation can be expressed on the basis of the overall reaction stoichiometry that connects the reactants (nutrients) to the products.The overall reaction is defined by the external metabolites.The rates of disappearance of reactants and accumulation of products in the cell environment are part of the metabolic network and represent individual reactions that meet the terms of Equation (1).However, the pathways expressed in individual elementary modes have different overall stoichiometries, relating the external metabolites, than the overall growth equation.But the overall growth equation is recovered through application of the linear combination of individual elementary modes as shown in Equation (1).
Entropy is a path independent state function.Therefore, the reaction stoichiometry between the external metabolites of the overall growth reaction and between the external metabolites in individual elementary modes permits computation of the entropy of reaction as: where S is the entropy of reaction.It represents the amount of entropy produced per mole of glucose reacted if the stoichiometry coefficient associated with glucose equals one.m S are the molar entropies of the k reactants and products that appear in the reaction equation with stoichiometry coefficients  m [17].Multiplication of the entropy of reaction with the rate of glucose consumption results in the rate of entropy production.However, the glucose consumption rate represents only a multiplication factor and effects related to entropy formation can be analyzed based on reaction entropies.
In order to use these quantities for practical applications they have to be estimated.The molar entropy of individual compounds can be calculated from their enthalpy and their free energy of formation using the Gibbs relationship (S = (H -G)/T).The standard molar enthalpies and the standard molar free energies of the individual compounds can be estimated from correlations with the degree of reduction of the compounds [17,18].In practical applications, an accurate value of the molar Gibbs free energy should account also for the concentration of the compound which may not be known in all situations.However, the concentrations change the standard values only to a small degree.For instance, the standard Gibbs free energy for glucose is 2,872 kJ/mole, and at a concentration of 10 mM glucose the free energy is 2,860 kJ/mole.Therefore, the standard molar quantities of the thermodynamic properties provide a reasonable approximation useful in practical applications because there is usually no other option to estimate the values [17,18].
The overall reaction entropy S TOT can be computed based on the macroscopic reaction stoichiometry between external metabolites using Equation (2), or from: where the individual reaction entropies of elementary modes, S i , computed using Equation ( 2), contribute to the overall value with weighting factors w i .Since the reaction entropies are expressed per mole of glucose consumed, the total rate of entropy production is obtained when S i is multiplied with the rate of glucose consumption.
We have previously made the experimental observation that individual weighting factors appear to be correlated with the entropy of reaction defined by the overall stoichiometry of external metabolites in individual elementary modes according to: where a and b are constants of the linear correlation [19].Substitution of Equation ( 4) in (3) results in an expression for the overall reaction entropy that depends only on weighting factors: This remarkable relationship is surprising as it is reminiscent of the Boltzmann distribution law describing the probability distribution of microstates that define the entropy content of a system.This relationship motivated the quest for further, theoretical investigations that could explain this result.In what follows, theoretical evidence is provided that the relationships expressed by Equations ( 4) and ( 5) are in fact expected expressions that can be derived in analogy to principles known from thermodynamics of irreversible processes and from familiar concepts in statistical mechanics.
For an open system operating at a steady state, two seemingly contradicting principles for the rate of entropy production have been proposed.The principle of minimum entropy production rate derived by Prigogine [20,21] has been an inherent part of non-equilibrium thermodynamics and applied to the description of various processes in physics, chemistry and biology.In contrast, the principle of maximum entropy production appears to be equally applicable in many other situations and appears to have even a more general validity for a recent review see [22].Both principles involve an extreme value of the rate of entropy production in an open system operating at steady state under non-equilibrium conditions.Without adopting a priori any of the two extremum principles we will first evaluate whether the rate of entropy production reaches such an extreme value under such conditions and show later the nature of the extreme value.
The reaction entropy is an extensive function as expressed by Equation (3), i.e., it is the sum of reaction entropies generated by all contributing elementary modes present.Thus, the entropy of reaction of the overall process is the sum of reaction entropies of individual elementary modes weighted according to their respective usage as expressed by Equation (3).Therefore, the overall reaction entropy can be viewed as a function of the usage probabilities of individual elementary modes present in the system: ( , , ,..., ) Please note that we interpret now the weighting factors as usage probabilities of elementary modes and use, therefore, the symbol p instead of w.The task is now to find out how the probability of using individual elementary modes is distributed among all elementary modes present such that the total entropy function reaches an extreme value.This assignment can be objectively done by satisfying the principle of Fair Apportionment of Outcomes [23] which represents a basic postulate of statistical mechanics.The Principle of Fair Apportionment states that in an unbiased and unconstrained system, all outcomes will be observed with the same probability, i.e., the system 'treats each outcome fairly' in comparison with every other outcome.While following this principle in the presence of constraints, the extremum of the entropy generating function can be found with the method of Lagrange multipliers [23].
To find the functional form of the probability apportionment that defines the extreme value of the reaction entropy one can rewrite Equation (3) in many different ways by varying the magnitude of probability assigned to a given elementary mode.This is done to enforce that assigned probabilities satisfy the multiplication rule of probability theory which is a requirement of the Principle of Fair Apportionment.While many options are possible, only one possibility yields the extreme value of the rate of entropy generation.We can arrange the assigned probabilities in a m x n matrix defined by m ways of distributing the usage probability among n elementary modes.Each probability element in this matrix follows the product rule (p i,j = v i u j ), i.e., it is defined by the product of the sum of column elements and the sum of row elements.
The method of Lagrange multipliers finds the extremum of the function S under the constraints that the sum of all probabilities must be equal to one and that the weighted sum over the columns and rows must sum to constant values (see Figure 1a) [23].The optimization problem can be expressed as finding the solution to: , 11 , 0 where and are the Lagrange multipliers that enforce the three constraints: It is interesting to note that the problem statement [Equation (7)] does not assume any form of the reaction entropy function.The type of function obtained as the solution is entirely intrinsic to the nature of the problem.
The solution to this problem is: where b and a are constants.For S TOT to have a positive value, b must be positive.In that case the extreme value is a maximum.The result can be rewritten as: Comparison of the individual terms of Equation (10) with Equations ( 3) and ( 4) shows that the expression in brackets corresponds to the individual reaction entropies S i , and the probabilities to the weighting factors.Thus, the reaction entropies S i are linearly correlated with the natural log of the probability of usage of the corresponding elementary modes.This provides the theoretical justification for the relationship that has been previously observed.

S TOT (kJ/K• mole)
The constant a is expected to be zero since the solution applies to all temperatures.And at an absolute temperature of zero all entropies are zero and the probability of producing zero entropy with any elementary mode is 1.A detailed derivation of Equation ( 10) is provided in Appendix 2.
An alternate way to get to the same result is by using the most likely distribution of probabilities of microstates of a system that is given, according to statistical mechanics, by the Boltzmann distribution law [23].In this distribution the entropy content of the macrosystem is maximized according to: The maximized entropy implies that the expression on the right side of Equation ( 11) is a maximum.If we recall that the reaction entropy of the macrosystem is composed of the weighted contributions of individual elementary modes [as stated also in Equation ( 3)]: then the right side of Equation ( 12) is a maximum if probabilities are assigned such that: (13) as this leads to the same expression as in Equation (11).Thus, the reaction entropy (and rate of entropy production if multiplied with the rate of glucose consumption) is maximized if the probabilities are distributed according to this relationship.Equation (10) has several further implications.The entropy of reaction for elementary mode i can be expressed as: Or: and with the definition of a probability, one obtains: where Q is similar to a partition function known from statistical mechanics: Comparing Equation (15) with Equation ( 16) and considering that a = 0, results in: This provides a convenient relationship to evaluate the constant b which depends only on the number of elementary modes and on their reaction entropies.The constant b expresses the constant ratio between reaction entropies of individual elementary modes and the associated usage probability that results in the maximum rate of entropy production in the system.It represents the ultimate state of a fully evolved metabolic network.The constant b is a quantity analogous to the Boltzmann constant, but it is different as it has different units.
The presented theory is supported by experimental data of byproduct secretion of wildtype E. coli [24] (see Figure 1b).The byproduct secretion pattern can be explained by the operation of four groups of elementary modes with the same overall stoichiometry (see Appendix 1).The total rate of entropy generation computed on the basis of the four experimentally determined weighting factors is in excellent agreement with the maximum entropy formation predicted by the presented theory because wildtype E. coli presumably is a highly evolved system.Furthermore, one would expect that experimental systems that are further away from the maximum entropy point of operation will eventually evolve in time towards that point (Figure 2a).An example of such evolution is shown in Figure 2b [25].16).(see Appendix 1 for detailed data analysis).
The developed relationships for the total reaction entropy of the system is very similar to the concept of information content of a system that tends to reach a maximum degree of uncertainty and is expressed as entropy [26].In fact the qualitative form of the relationship is the same.The maximum degree of uncertainty, expressed as the Shannon entropy, has been recently used to evaluate the elementary mode composition of the metabolism [29].
The evolution of living systems has been often connected with Prigogine's minimum entropy production principle [20,21].However, this principle does not appear to apply for all cases.Rather, it is only applicable in situations where multiple reactions in an open system can occur in parallel.If one fixes the condition for one reaction and allows the others to adjust freely, then the other reactions will tend to reach equilibrium thus minimizing the rate of entropy formation.But this is evidently not the case in situations where reactions are enclosed by a cell envelope, and only a limited number of reaction metabolites reach the environment.In the open steady state system the individual reactions never reach a state of thermodynamic equilibrium since all pathways start with glucose as the carbon and energy source, and glucose is constantly replenished in the continuous operation of the open system.Thus, since individual reactions are all confined to the same cellular space and the reaction conditions cannot adjust independently, the principle of maximum entropy production applies.In fact it has been shown that for a continuous stirred tank reactor (CSTR) operating near equilibrium, the theorem of minimum entropy production does not apply due to the convective flows between CSTR and its surroundings [27].
The system selects a mixture of elementary modes in the most probable way.It is a direct reflection of this principle that an inherent advantage is given to more efficient, less entropy producing modes.This is a testable hypothesis that is already supported by some experimental evidence [19,28].It will be interesting to see whether further experimental work that is directly aimed to investigate this relationship, can confirm this behavior.
Table 2 shows the secretion rates as described by Aristidou et al. [24].From these experimental data the weighting factors for the individual mode families have been computed as described in Wlaschin et al. [19].Essentially, it involves the solution of: where R is the column vector of known metabolite secretion rates, N is the stoichiometry matrix of known mode families, and w is the vector of the corresponding unknown weights.[24].The predicted weight factors are calculated from entropies of the family modes as described in Equation (14).The total entropies are calculated from the weighted sum of entropies of the family modes, wiSi.
(2) Entropy calculations in Figure 2a of the main text contain the total entropy data as described by Wlaschin et al. [19].The entropy calculations in Figure 2b are based on experimental data described by Hua et al. [25].We used the metabolic model described by Carlson and Srienc [1] containing the two gene knockouts, adhEpta (Hua et al., [25]), for elementary mode calculations.The computation yielded 16 elementary modes that can be grouped into five families of modes with the same overall stoichiometry.They are shown in Table 4 together with the entropies of reaction.Table 5 summarizes the weighting factor for the three time points during the adaptation.

Figure 1 .
Figure 1.Entropy generation as a function of weighting factors of elementary modes for E. coli under anaerobic non-growth conditions.(a) Total entropy production without constraint (black) and with the applied constraint of weighting factor w 1 = exp(-S 1 /b) = 0.549 (blue plane).The system shows a global maximum entropy production of 0.376 kJ/K-mole when entropy generation is uniformly distributed and a local maximum entropy production of 0.352 kJ/K-mole located at the intersection between the cone and the plane when entropies are constrained to the value of the reaction entropies of individual elementary modes.(b) Comparison of predicted, maximum entropy production ()with experimentally determined entropy generation () based on data reported by Aristidou et al. (1999)[24].The weighted average of entropy production of any combination of existing elementary modes is located on the gray plane (Equation (3)) while the blue surface represents combinations of elementary modes distributed according to the Gibbs measure.The blue surface touches the gray plane at the location of the constrained maximum entropy production point.The experimentally determined point is located very close to the predicted maximum entropy generation point reflecting the highly evolved metabolism of wildtype E. coli cells.(see Appendix 1 for detailed data).

Figure 2 .
Figure 2. Total entropy production as a function of weighting factors for E. coli under anaerobic non-growth conditions.(a) Comparison of predicted maximum total entropy production with experimental entropy generation () reported in Wlaschin et al. (2006) [19].The experimental point is on the gray plane and is expected to evolve in time towards the predicted maximum entropy generation point () (b) Time course of the entropy generation in an evolving system determined from data by Hua et al. (2006) [25].The experimentally determined reaction entropies are located on the gray plane and move with time during adaptation towards the predicted maximum entropy production point: (), unevolved system; (), after 30 days of adaptation; (), after 60 days of adaptation.(c) Total entropy generation as function of evolution time.With time the system is expected to reach the maximum entropy generation where the elementary mode weighting factors are distributed according to Equation (16).(see Appendix 1 for detailed data analysis).
[24] fluxes are extrapolated to a zero growth rate from measured fluxes from a series of chemostats at different dilution rates as described by Aristidou et al.[24].

Table 3
compares the weighting factors computed from the experimental data to the predicted values calculated from only the entropies of elementary modes

Table 3 .
The weighting factors and entropies for each family of modes of E. coli GJT001.

Table 4 .
Stoichiometric equations of elementary mode families for E. coli with deletions adhEpta.

Table 5 .
The weighting factors and entropies for each family of modes of E. coli containing the deletions adhEpta during adaptive evolution.The total reaction entropies per mole glucose consumed, S TOT , are the sums of products of individual weighting factors and associated family entropies S i .