# A Novel Methodology to Estimate Metabolic Flux Distributions in Constraint-Based Models


## Abstract


## 1. Introduction

$\gamma^{\mu}$ the rate of change of the intracellular level of species $\mu$ due to exchanges between the cell's interior and the environment, then, under mass action kinetics, the intracellular concentration, $c^{\mu}$, of metabolite $\mu$ obeys the equation:

$\gamma^{\mu} > 0$ (resp. $\gamma^{\mu} < 0$) if there is a net out-take (resp. in-take) of species $\mu$.

$x^{i} \in [m^{i}, M^{i}]$ (which also encode reaction reversibility assumptions), and for exchange rates, i.e., $\gamma^{\mu} \in [m^{\mu}, M^{\mu}]$, with which conditions (1) can be written as:

$$m^{i} \le x^{i} \le M^{i}, \qquad i = 1,\dots,N$$

$\gamma^{\mu} = 0$), as well as exchanged species (i.e., with $\gamma^{\mu} \neq 0$). The solution space of Equation (2) is, in turn, given by:

## 2. Methodology

#### 2.1. Mathematical Statement of the Problem

$P_{i}(x)$ for each reaction flux $i = 1,\dots,N$. By definition, they are given by:

$P_{i}(x)$ can be mathematically written as an integral, over all fluxes but $x^{i}$, of a set of functions enforcing the constraints that define the polytope, namely (1). Denoting by $F_{\mu}$ these functions ($\mu = 1,\dots,M$), we can write $P_{i}(x)$ for flux $i$ as:

Here, $\mathbf{x}_{\setminus i} = (x^{1},\dots,x^{i-1},x^{i+1},\dots,x^{N})$ denotes the collective integration variable, the domain of integration is the set-product of the ranges of variability of all fluxes except flux $i$, and $Z_{i} \equiv \int dx\, P_{i}(x)$ is a normalisation constant, so that each $P_{i}(x)$ is properly normalised to one. Each indicator function $F_{\mu}$ should distinguish between metabolites involved only in internal reactions and metabolites that are exchanged with the surroundings. A convenient parameterisation is given by:

Here, $\rho_{\mu}$ is an a priori distribution for the exchange rate $\gamma^{\mu}$. Under this choice, for intracellular metabolites, $F_{\mu}$ simply enforces the mass-balance constraint (1) with $\gamma^{\mu} = 0$ in Equation (6) through a $\delta$-function. For exchanged chemical species, the situation is slightly more complex. If the exchange rate takes a fixed value $z_{0}$, then $\rho_{\mu}(z) = \delta(z - z_{0})$ and:

$$F_{\mu}(y) = \delta(y - z_{0})$$

which enforces (1) with $\gamma^{\mu} = z_{0}$. If, however, the exchange rate is known only probabilistically, then $\rho_{\mu}(z)$ can be a non-trivial distribution and $F_{\mu}$ enforces (1) in (6) by weighing all possible values of $\gamma^{\mu}$ according to the measure $\rho_{\mu}$. For instance, if there is no a priori information about the exchange rate, then $\rho_{\mu}(z)$ can be taken to be uniform. Note that when $\gamma^{\mu}$ is a random variable, one can consider it as another unknown rate, so that one could also be interested in estimating its a posteriori distribution $P_{\mu}(\gamma)$. The problem we want to address is that of computing quantities like Equation (6) for all $i$.
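The three cases discussed above (intracellular metabolite, fixed exchange rate, uninformative prior) can be collected in one compact form. The block below is a sketch under stated assumptions rather than a formula taken from the text: $y$ stands for the net production of metabolite $\mu$ by the reactions, so that mass balance reads $y = \gamma^{\mu}$, and the uniform prior is assumed to live on the range $[m^{\mu}, M^{\mu}]$ introduced for the exchange rates.

```latex
% Assumed general form: average the delta-constraint y = gamma over the prior rho_mu
\begin{aligned}
F_{\mu}(y) &= \int dz\, \rho_{\mu}(z)\, \delta(y - z) = \rho_{\mu}(y), \\[4pt]
% intracellular metabolite: gamma^mu = 0
\rho_{\mu}(z) = \delta(z) \;&\Rightarrow\; F_{\mu}(y) = \delta(y), \\[4pt]
% fixed exchange rate z_0
\rho_{\mu}(z) = \delta(z - z_{0}) \;&\Rightarrow\; F_{\mu}(y) = \delta(y - z_{0}), \\[4pt]
% no a priori information: uniform prior on [m^mu, M^mu]
\rho_{\mu}(z) = \frac{1}{M^{\mu} - m^{\mu}} \;&\Rightarrow\;
F_{\mu}(y) = \frac{1}{M^{\mu} - m^{\mu}} \quad \text{for } m^{\mu} \le y \le M^{\mu},
\text{ and } 0 \text{ otherwise}.
\end{aligned}
```

Read this way, a fixed exchange rate is simply the limit of an infinitely sharp prior, and the uniform case turns the equality constraint into a box constraint on the net production.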

#### 2.2. Weighted Belief Propagation

$L_{\mu \to i}(x)$ that mass balance holds for metabolite $\mu$ can be expressed, in terms of the factorized PDF computed in absence of $\mu$, as (see Figure 1b):

with $Z_{\mu \to i}$ a normalisation constant. In this formula, we use Latin labels ($i, l, \dots$) for reactions and Greek ones ($\mu, \nu, \dots$) for metabolites, while the subscript $l \in \mu \setminus i$ denotes the reactions that process $\mu$ except reaction $i$. Accordingly, we defined the shorthands $dx_{\mu \setminus i} = \prod_{l \in \mu \setminus i} dx^{l}$ and $D_{\mu \setminus i} = \times_{l \in \mu \setminus i} [m^{l}, M^{l}]$. The quantity $P_{l \to \mu}(x^{l})$ is the PDF of flux $l$ taking a value $x^{l}$ when metabolite $\mu$ is removed. Those PDFs are, in turn, given by the probability, for each reaction $i$, of satisfying the mass balance conditions for all the metabolites it processes, except $\mu$ (see Figure 1c), namely:

Here, $Z_{i \to \mu}$ is a normalisation constant. Again, the above equations simply state that, on locally tree-like graphs, the contributions to the PDFs coming from each node (reaction or metabolite) factorize.
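The two message updates described above can be illustrated on a toy case. The following is a minimal sketch, not the weighted sampling scheme of this paper: it discretises fluxes on a uniform grid, and it assumes a single intracellular metabolite processed by exactly three reactions with stoichiometric coefficients `s_i`, `s_1`, `s_2` (hypothetical names, as are the function names). The $\delta$-function is solved for one flux, leaving a one-dimensional integral.

```python
import numpy as np

def normalise(p, grid):
    """Scale a non-negative array so that it integrates to one on a uniform grid."""
    dx = grid[1] - grid[0]
    return p / (p.sum() * dx)

def metabolite_to_reaction(x_grid, s_i, s_1, s_2, grid_1, p_1, grid_2, p_2):
    """Message L_{mu->i}(x) for an intracellular metabolite with three reactions.

    Mass balance s_i*x + s_1*x1 + s_2*x2 = 0 is solved for x2, so only the
    integral over x1 remains; p_1 and p_2 play the role of P_{l->mu}.
    """
    dx1 = grid_1[1] - grid_1[0]
    out = np.empty_like(x_grid)
    for k, x in enumerate(x_grid):
        x2 = -(s_i * x + s_1 * grid_1) / s_2
        # Density of flux 2 at the value forced by mass balance (zero outside its range).
        p2_at_x2 = np.interp(x2, grid_2, p_2, left=0.0, right=0.0)
        out[k] = np.sum(p_1 * p2_at_x2) * dx1 / abs(s_2)
    return normalise(out, x_grid)

def reaction_to_metabolite(messages, x_grid):
    """Cavity marginal P_{i->mu}(x): normalised product of the incoming messages."""
    return normalise(np.prod(np.vstack(messages), axis=0), x_grid)

# Toy example (assumed): reactions 1 and 2 produce the metabolite, reaction i consumes it.
grid_12 = np.linspace(0.0, 1.0, 201)
uniform = normalise(np.ones_like(grid_12), grid_12)
x_grid = np.linspace(0.0, 2.0, 401)
L_mu_i = metabolite_to_reaction(x_grid, -1.0, 1.0, 1.0, grid_12, uniform, grid_12, uniform)

# If reaction i touched a second, identical metabolite nu, its cavity marginal
# would be the normalised product of the two incoming messages.
P_i = reaction_to_metabolite([L_mu_i, L_mu_i], x_grid)
```

With two uniformly distributed producers, the message to the consumer is the triangular density of their sum, peaked at $x = 1$. A grid is convenient for a three-reaction toy case, but the integral over all neighbouring fluxes grows quickly with the metabolite's degree, which is the motivation for sampling-based evaluation.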

**Figure 1.** The method used to derive self-consistency Equations (9) and (10) for the conditional marginals; we only show the nearest neighbours of what one must imagine to be a large tree-like bipartite graph, where circles are reactions and squares are metabolites. (**a**) Metabolite $\mu$ and the reactions that process it ($l \in \mu$). If we remove $\mu$ from the system, all reactions connected to it belong to disjoint branches of the metabolic network, highlighted with the dashed lines. As a consequence, their joint probability distribution function (PDF) factorizes into the product of the marginals, $P_{l \to \mu}(x)$, of each reaction $l$. (**b**) When metabolite $\mu$ is put back in the graph, the probability, $L_{\mu \to i}(x)$, of satisfying its mass balance condition when fixing the flux of reaction $i$ to $x$ depends on the marginals, $P_{l \to \mu}(x)$, of all neighbours but $i$, and on the indicator function, $F_{\mu}$. (**c**) The marginal $P_{i \to \mu}(x)$, which is computed in absence of $\mu$, expresses the probability that $i$ satisfies the mass balance conditions for all the metabolites it processes ($\nu \in i$), except $\mu$. On a tree, each mass balance condition is independent, so that the probability of satisfying all of them is given by the product of the various $L_{\nu \to i}(x)$.

Here, $Z_{i}$ is a normalisation constant. Note that Equation (11) also provides the recipe to evaluate the PDFs, $P_{\mu}(\gamma)$, for the exchange rates, $\gamma^{\mu}$, once the conditional marginals, $P_{l \to \mu}(x^{l})$, are known.

Here, the densities of $y$ and $z$ are known and normalised in the interval $[0, 1]$, and $C$ is a normalisation constant. To evaluate (12), we could use Monte Carlo integration and draw pairs of random variables, $(y_{i}, z_{i})$, according to these two densities. Correspondingly, an estimate for the density of $x$ can be written as:

Here, the factor involving $1 - y_{i} - z_{i}$ accounts for draws for which the quantity $x = 1 - y_{i} - z_{i}$ must be rejected due to the condition $x \ge 0$. The latter condition indeed defines a feasible triangular region in the integration plane, $yz$ (see Figure 2), such that every extraction $(y_{i}, z_{i})$ falling outside this domain must be rejected. This method (basically, naive Monte Carlo integration) is, hence, bound to be rather inefficient. Fortunately, we know precisely where the rejection region is, and we can rewrite Equation (12) as follows:

Note that the density of $z$ is not normalised in the interval $[0, 1 - y]$. Introducing the corresponding weight:

we define the conditional density of $z$ given $y$ as the density of $z$ divided by $w(y)$. The distributions appearing above are now properly normalised. Therefore, to evaluate Equation (16), we can simply draw pairs according to the density of $y$ and the conditional density of $z$ given $y$, respectively, and estimate the density of $x$ by:

The conditional density of $z$ given $y$ has a $y$-dependent support, such that rejection never occurs. Thus, at the price of computing a weight, $w(y)$, we overcome the whole rejection issue, and the method becomes much more efficient.
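The contrast between the two estimators can be checked numerically. The sketch below assumes the simplest possible setting, which is not taken from the paper: both $y$ and $z$ are uniform on $[0, 1]$, so that the weight is $w(y) = 1 - y$ and $z$ given $y$ is uniform on $[0, 1 - y]$. Function and variable names are illustrative only.

```python
import numpy as np

def naive_draws(n, rng):
    """Draw (y, z) in the unit square and reject pairs with x = 1 - y - z < 0."""
    y = rng.uniform(0.0, 1.0, n)
    z = rng.uniform(0.0, 1.0, n)
    x = 1.0 - y - z
    return x[x >= 0.0]

def weighted_draws(n, rng):
    """Draw y, then z restricted to [0, 1 - y]; every draw is kept with weight w(y)."""
    y = rng.uniform(0.0, 1.0, n)
    w = 1.0 - y                                # mass of the uniform z-density on [0, 1 - y]
    z = (1.0 - y) * rng.uniform(0.0, 1.0, n)   # z | y is uniform on [0, 1 - y]
    x = 1.0 - y - z
    return x, w

rng = np.random.default_rng(0)
n = 200_000
x_naive = naive_draws(n, rng)
x_weighted, w = weighted_draws(n, rng)

# Histogram estimates of the density of x; the weighted one uses w(y) as sample weight.
bins = np.linspace(0.0, 1.0, 21)
hist_naive, _ = np.histogram(x_naive, bins=bins, density=True)
hist_weighted, _ = np.histogram(x_weighted, bins=bins, weights=w, density=True)

acceptance = x_naive.size / n  # fraction of naive draws that survive rejection
```

In this uniform setting the exact density of $x$ is $2(1 - x)$ on $[0, 1]$, and the naive scheme discards about half of its draws, while the weighted scheme uses all of them. The gap widens when the feasible region is a small fraction of the sampling box, as happens when many constraints are combined.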

**Figure 2.** Avoiding rejection. As explained in the text, from the original integration region, $(y, z) \in [0, 1] \times [0, 1]$, only the part below the line $z = 1 - y$ contributes to the density of $x$. However, in this lower triangle, the product of the densities of $y$ and $z$ is no longer normalised. This is easily dealt with by reweighting the integral.

**Figure 3.** The running time $t$ of the weighted Belief Propagation (wBP) algorithm vs. the number of reactions, $N$. For each value of $N$, we average here over 10 random synthetic metabolic networks, each having $M = N/2$ metabolites. The algorithm (blue circles) scales linearly with the system size; a linear function, $t \propto N$ (green dashed line), is plotted to guide the eye.

#### 2.3. The Kernel Hit-and-Run (KHR) Algorithm

$$m^{i} \le x^{i} \le M^{i}, \qquad i = 1,\dots,N$$

$(y^{1},\dots,y^{K})$ the system of coordinates with respect to such a basis, so that we can write each flux in this basis through $\Phi$, an $N \times K$ matrix related to the change of basis between the original space and the null subspace. Plugging this into Equations (19) and (20) allows us to write:

**Figure 4.** A cartoon representation of the polytope in a $K$-dimensional null space spanned by the $y$-coordinates. Here, the green dashed lines represent the set of hyperplanes (21) enclosing the polytope.
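The geometric picture of Figure 4 translates into a short sampler. What follows is a generic hit-and-run sketch in the null space, written under assumptions and not meant as the authors' implementation: the stoichiometric matrix is called `S`, the steady state is taken to be homogeneous ($S x = 0$), the null-space basis $\Phi$ is obtained from a singular value decomposition, and a feasible starting point `x0` is supplied by the caller.

```python
import numpy as np

def null_space_basis(S, tol=1e-10):
    """Orthonormal basis Phi (N x K) of the null space of S, from the SVD."""
    _, sing, vt = np.linalg.svd(S)
    rank = int(np.sum(sing > tol))
    return vt[rank:].T

def hit_and_run(S, lower, upper, x0, n_steps, rng):
    """Sample flux vectors with S @ x = 0 and lower <= x <= upper."""
    phi = null_space_basis(S)
    x = np.asarray(x0, dtype=float).copy()
    samples = np.empty((n_steps, x.size))
    for step in range(n_steps):
        # Random direction inside the null space, so mass balance is preserved.
        d = phi @ rng.standard_normal(phi.shape[1])
        d /= np.linalg.norm(d)
        # Each bound is a hyperplane; together they cut a segment [t_min, t_max]
        # out of the line x + t * d.
        moving = np.abs(d) > 1e-12
        t_lo = (lower[moving] - x[moving]) / d[moving]
        t_hi = (upper[moving] - x[moving]) / d[moving]
        t_min = np.max(np.minimum(t_lo, t_hi))
        t_max = np.min(np.maximum(t_lo, t_hi))
        # Jump to a uniformly chosen point of the segment.
        x = x + rng.uniform(t_min, t_max) * d
        samples[step] = x
    return samples

# Toy network (assumed): one metabolite, produced by reactions 1 and 2, consumed by 3.
S = np.array([[1.0, 1.0, -1.0]])
lower = np.zeros(3)
upper = np.array([1.0, 1.0, 2.0])
x0 = np.array([0.5, 0.5, 1.0])
samples = hit_and_run(S, lower, upper, x0, 5000, np.random.default_rng(1))
```

Histograms of the columns of `samples` then give empirical estimates of the single-flux marginals. Moving only along null-space directions is what keeps every sample on the steady-state manifold, so only the box constraints need to be checked at each step.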


## 3. Results and Discussion

We applied wBP $10^{3} \times t$ times to evaluate the average weight $\alpha_{i}$. Once convergence was reached, we used the variable/weight sets to compute the final 46 PDFs, $P_{i}(x)$ and $P_{\mu}(\gamma)$, according to Equation (11). In this last step, we averaged the weight values over $10^{5}$ wBP extractions to achieve a higher accuracy. We report the results in Figure 5 and Figure 6, where we compare our method with KHR. The agreement is excellent for most reactions: the reaction PDFs obtained with both methods have a very similar domain and shape in most of the cases. Notably, wBP does not perfectly capture the profile of reactions involving currency metabolites, such as ATP, ADP, NADP and NADPH. A possible explanation is that these compounds are highly connected in metabolic networks and are likely to be involved in small loops, which the tree-like approximation underlying wBP does not account for.

**Figure 5.** Results for the human red blood cell. Here, we draw a pictorial representation of the system as a directed bipartite graph. Reaction nodes are plotted with their PDFs and metabolite nodes with green squares. Arrows entering (resp. leaving) a reaction stand for a substrate (resp. a product). We have plotted the marginals, $P_{i}(x)$, for the internal reactions together with the $P_{\mu}(\gamma)$ for the exchange rates (these are the leaves on the bipartite graph). For the densities, we have used the wBP method (red filled plots) and have compared them with the KHR algorithm (blue solid lines).

**Figure 6.** Results for the human red blood cell. The probability density functions of the reaction rates; reaction names are the same as in [9]. For the densities, we have used the wBP method (red filled plots) and have compared them with the Kernel Hit-and-Run (KHR) algorithm (blue solid lines). Note that the flux ranges span different orders of magnitude, but still, the profiles are very smooth for both weighted population dynamics and the Kernel Hit-and-Run algorithm.

## 4. Conclusions

## Acknowledgments

## Conflicts of Interest

## References

1. Bowman, S.; Churcher, C.; Badcock, K.; Brown, D.; Chillingworth, T.; Connor, R.; Dedman, K.; Devlin, K.; Gentles, S.; Hamlin, N. The nucleotide sequence of Saccharomyces cerevisiae chromosome XIII. Nature **1997**, 387, 90–92.
2. Feist, A.; Henry, C.; Reed, J.; Krummenacker, M.; Joyce, A.; Karp, P.D.; Broadbelt, L.J.; Hatzimanikatis, V.; Palsson, B.Ø. A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Mol. Syst. Biol. **2007**.
3. Thiele, I.; Palsson, B.O. A protocol for generating a high-quality genome-scale metabolic reconstruction. Nat. Protoc. **2010**, 5, 93–121.
4. Thiele, I.; Swainston, N.; Fleming, R.M.T.; Hoppe, A.; Sahoo, S.; Aurich, M.K.; Haraldsdottir, H.; Mo, M.L.; Rolfsson, O.; Stobbe, M.D.; et al. A community-driven global reconstruction of human metabolism. Nat. Biotechnol. **2013**, 31, 419–425.
5. Kauffman, K.J.; Prakash, P.; Edwards, J.S. Advances in flux balance analysis. Curr. Opin. Biotechnol. **2003**, 14, 491–496.
6. Orth, J.; Thiele, I.; Palsson, B. What is flux balance analysis? Nat. Biotechnol. **2010**, 28, 245–248.
7. Schellenberger, J.; Palsson, B.Ø. Use of randomized sampling for analysis of metabolic networks. J. Biol. Chem. **2009**, 284, 5457–5461.
8. Lovász, L. Hit-and-run mixes fast. Math. Progr. **1999**, 86, 443–461.
9. Braunstein, A.; Mulet, R.; Pagnani, A. Estimating the size of the solution space of metabolic networks. BMC Bioinforma. **2008**.
10. Mezard, M.; Montanari, A. Information, Physics, and Computation; Oxford University Press: Oxford, UK, 2009.
11. Font-Clos, F.; Massucci, F.A.; Pérez Castillo, I. A weighted belief-propagation algorithm for estimating volume-related properties of random polytopes. J. Stat. Mech. Theory Exp. **2012**.
12. Price, N.D.; Schellenberger, J.; Palsson, B.O. Uniform sampling of steady-state flux spaces: Means to design experiments and to interpret enzymopathies. Biophys. J. **2004**, 87, 2172–2186.
13. Almaas, K.; Kovacs, B.; Vicsek, T.; Oltvai, Z.M.; Barabasi, A.L. Global organization of metabolic fluxes in the bacterium Escherichia coli. Nature **2004**, 427, 839–843.
14. Simonovits, M. How to compute the volume in high dimension? Math. Progr. **2003**, 97, 337–374.
15. Smith, R.L. Efficient Monte Carlo procedures for generating points uniformly distributed over bounded regions. Oper. Res. **1984**, 32, 1296–1308.
16. Berbee, H.; Boender, C.; Ran, A.R.; Scheffer, C.; Smith, R.; Telgen, J. Hit-and-run algorithms for the identification of nonredundant linear inequalities. Math. Progr. **1987**, 37, 184–207.
17. Wiback, S.J.; Famili, I.; Greenberg, H.J.; Palsson, B.Ø. Monte Carlo sampling can be used to determine the size and shape of the steady-state flux space. J. Theor. Biol. **2004**, 228, 437–447.
18. Wiback, S.J.; Mahadevan, R.; Palsson, B.Ø. Reconstructing metabolic flux vectors from extreme pathways: Defining the α-spectrum. J. Theor. Biol. **2003**, 224, 313–324.
19. Wiback, S.J.; Palsson, B.O. Extreme pathway analysis of human red blood cell metabolism. Biophys. J. **2002**, 83, 808–818.
20. Krauth, W.; Mezard, M. Learning algorithms with optimal stability in neural networks. J. Phys. A **1987**.
21. Wodke, J.A.H.; Puchalka, J.; Lluch-Senar, M.; Marcos, J.; Yus, E.; Godinho, M.; Gutierrez-Gallego, R.; dos Santos, V.A.P.M.; Serrano, L.; Klipp, E.; Maier, T. Dissecting the energy metabolism in Mycoplasma pneumoniae through genome-scale metabolic modeling. Mol. Syst. Biol. **2013**, 9.

© 2013 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

## Share and Cite

**MDPI and ACS Style**

Massucci, F.A.; Font-Clos, F.; De Martino, A.; Castillo, I.P.
A Novel Methodology to Estimate Metabolic Flux Distributions in Constraint-Based Models. *Metabolites* **2013**, *3*, 838-852.
https://doi.org/10.3390/metabo3030838
