This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

Quite generally, constraint-based metabolic flux analysis describes the space of viable flux configurations for a metabolic network as a high-dimensional polytope defined by the linear constraints that enforce the balancing of production and consumption fluxes for each chemical species in the system. In some cases, the complexity of the solution space can be reduced by performing an additional optimization, while in other cases, knowing the range of variability of fluxes over the polytope provides a sufficient characterization of the allowed configurations. There are cases, however, in which the thorough information encoded in the individual distributions of viable fluxes over the polytope is required. Obtaining such distributions is known to be a highly challenging computational task when the dimensionality of the polytope is sufficiently large, and the problem of developing cost-effective

The development of high-throughput techniques now makes available a considerable number of high-quality reconstructions of the metabolism of a variety of organisms, which include the stoichiometry of the biochemical reactions in the network and the underlying enzyme-gene associations.

Unfortunately, addressing the above system in full generality requires knowledge of reaction mechanisms and kinetic constants (which specify how rates depend on concentrations), and this knowledge is at best only partially available. Besides, it is not entirely clear to us that, were that information fully at our disposal, simulating (1) for a genome-scale reconstruction involving thousands of reactions and metabolites would be a sensible thing to do.

Computational studies of metabolic networks therefore generally assume that the cell operates at non-equilibrium steady-state (NESS) conditions, where the concentrations of the metabolites are constant. The solution space of Equation (2) is, in turn, given by:
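For reference, the steady-state condition and the polytope of feasible fluxes can be written in the generic notation common in constraint-based modelling. The symbols below ($S$, $v$, $b$ and the flux bounds) are assumptions chosen for illustration and need not match the notation used elsewhere in this article; the sign convention for $b$ in particular varies between authors.

```latex
% Generic notation (assumed, not necessarily the article's):
%   S : M x N stoichiometric matrix (metabolites x reactions)
%   v : vector of the N reaction fluxes
%   b : vector of net exchange rates of the M metabolites with the environment
\begin{align*}
  \sum_{i=1}^{N} S_{\mu i}\, v_i &= b_\mu , \qquad \mu = 1,\dots,M , \\
  \mathcal{P} &= \left\{ v \in \mathbb{R}^{N} \;:\; S v = b ,\;\;
                 v_i^{\min} \le v_i \le v_i^{\max} \ \ \forall i \right\} .
\end{align*}
```

Metabolites that take part only in internal reactions correspond to rows with $b_\mu = 0$.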

The problem we address here concerns computational methods to sample

Here, we build on the work presented in [

Suppose that we are interested in estimating the probability distribution functions (PDFs) _{i}(_{i}(_{i}_{µ} these functions (_{i}(_{\}_{i}^{1},...,^{i}^{−}^{1},^{i}^{+1},...,^{N}_{i} ≡ _{i}(_{i}(_{µ} should distinguish between metabolites involved only in internal reactions (_{µ} is an ^{µ}_{µ} simply enforces the mass-balance constraint (1) with ^{µ}_{0}, then _{µ}(_{0}) and:
_{µ}(_{0})
^{µ}_{0}. If, however, the exchanged rate is known only probabilistically, then _{µ}(_{µ} enforces (1) in (6) by weighing all possible values of ^{µ}_{µ}. For instance, if there is no a _{µ}(^{µ}_{µ}(

To push mathematically forward expression (6), we need to do some type of approximation for _{µ→i}(_{µ→i}, a normalisation constant. In this formula, we use Latin labels (_{µ\i} = ∏_{l}_{∈ }_{µ}_{\}_{i}^{l}_{µ\i} = ×_{l ∈ µ\i}[^{l}^{l}_{l→µ}(^{l}^{l}_{i→µ} is a normalisation constant. Again, the above equations simply state the fact that, on locally tree-like graphs, the contributions to the PDFs coming from each node (reaction or metabolite) nicely factorize.

The method used to derive the self-consistency Equations (9) and (10) for the conditional marginals. We show only the nearest neighbours of what one must imagine to be a large tree-like bipartite graph, where circles are reactions and squares are metabolites.

The conceptual step of removing metabolites from the system is the key that allows us to recast the problem in the set of self-consistency Equations (9) and (10), for the conditional probabilities (the reader should keep in mind that this is, however, just a mathematical trick with no biological interpretation whatsoever [_{i} is a normalisation constant. Note that Equation (11) also provides the recipe to evaluate the PDFs, _{µ}(^{µ}_{l→µ}(^{l}

As discussed in [_{y}_{z} are known densities normalised in the interval [0, 1], and _{y}_{z}_{x}(_{i}_{i}_{i}, _{i}) falling outside this domain must be rejected. This method (basically, naive Monte Carlo integration) is, hence, poised to be rather inefficient. Fortunately, we know precisely where the rejection region is, and we can rewrite Equation (12) as follows:
_{z}(_{z}(_{y}(_{z}(_{x}(

The key point of this method is that the reweighted density, _{z}(

The great advantage of using wBP is that, at fixed

Avoiding rejection.
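The difference between naive rejection and a rejection-free, weighted scheme can be illustrated on a deliberately simple toy problem. The sketch below is not the authors' implementation: it assumes uniform densities for $y$ and $z$ on [0, 1] and a hypothetical constraint $x = y + z$ with $x$ required to lie in [0, 1]. Because the allowed interval for $z$ is known once $y$ is drawn, $z$ can be sampled only there, and each sample carries a weight equal to the probability mass of that interval.

```python
import numpy as np

def rejection_estimate(n, rng):
    """Naive Monte Carlo: draw (y, z) and keep only pairs with x = y + z <= 1."""
    y = rng.uniform(0.0, 1.0, n)
    z = rng.uniform(0.0, 1.0, n)
    x = y + z
    keep = x <= 1.0
    # Accepted samples all carry the same weight.
    return x[keep], np.ones(int(keep.sum()))

def weighted_estimate(n, rng):
    """Rejection-free: draw z only from its allowed interval and carry a weight."""
    y = rng.uniform(0.0, 1.0, n)
    z = rng.uniform(0.0, 1.0 - y)  # z restricted to [0, 1 - y], so x <= 1 always
    w = 1.0 - y                    # mass of [0, 1 - y] under the uniform density of z
    return y + z, w

n_draws = 200_000
rng = np.random.default_rng(0)
x_rej, w_rej = rejection_estimate(n_draws, rng)
x_wgt, w_wgt = weighted_estimate(n_draws, rng)

# Both schemes target the same distribution of x (exact mean 2/3 for this toy case).
mean_rej = np.average(x_rej, weights=w_rej)
mean_wgt = np.average(x_wgt, weights=w_wgt)
acceptance = x_rej.size / n_draws  # about one half of the naive draws are wasted
```

In this toy case roughly half of the naive draws are discarded, whereas the weighted scheme uses every draw. A weighted histogram, for example `np.histogram(x_wgt, bins=50, weights=w_wgt, density=True)`, then gives an estimate of the density of $x$.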

The running time

In order to sample the solution space ^{i}^{i}^{i }^{1},...,^{K}

While the sampling measure of KHR is well controlled, a word needs to be spent on the algorithmic mixing time. For the standard

A cartoonish representation of the polytope in a

Finally, it is worth mentioning that the matrix Φ can be easily obtained with any standard algebra software or by standard SVD algorithms, and that no matrix inversion is required to compute the projected matrix Ψ nor to convert the obtained sample to the original, full dimensional space. Therefore, to sum up, KHR uniformly samples the solution space ^{3}).
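As an illustration of these two remarks, the following minimal sketch obtains an orthonormal basis of the null space from an SVD and runs a plain hit-and-run walk along directions drawn in that null space. It is a simplified stand-in, not the KHR algorithm itself: it assumes only homogeneous constraints $S v = 0$ with box bounds on the fluxes, and the names `null_space_basis`, `hit_and_run`, `S`, `lower` and `upper` are hypothetical. No matrix inversion appears anywhere, and the samples are produced directly in the full-dimensional space.

```python
import numpy as np

def null_space_basis(S, tol=1e-10):
    """Orthonormal basis (as columns) of the kernel of S, read off from the SVD."""
    _, sing, vt = np.linalg.svd(S)
    rank = int(np.sum(sing > tol * sing.max()))
    return vt[rank:].T

def hit_and_run(S, lower, upper, v0, n_steps, rng):
    """Hit-and-run inside {v : S v = 0, lower <= v <= upper}, started at a feasible v0."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    phi = null_space_basis(S)
    v = np.array(v0, dtype=float)
    samples = np.empty((n_steps, v.size))
    for t in range(n_steps):
        # Isotropic direction in the reduced coordinates, mapped to the full space.
        d = phi @ rng.standard_normal(phi.shape[1])
        d /= np.linalg.norm(d)
        # Chord through v along d: all s with lower <= v + s * d <= upper.
        moving = np.abs(d) > 1e-12
        s1 = (lower[moving] - v[moving]) / d[moving]
        s2 = (upper[moving] - v[moving]) / d[moving]
        s_min = np.minimum(s1, s2).max()
        s_max = np.maximum(s1, s2).min()
        # Jump to a uniformly chosen point on the chord.
        v = v + rng.uniform(s_min, s_max) * d
        samples[t] = v
    return samples

# Toy network (hypothetical): one metabolite produced by v1 and consumed by v2 and v3.
S = np.array([[1.0, -1.0, -1.0]])
lower = np.zeros(3)
upper = np.ones(3)
rng = np.random.default_rng(1)
samples = hit_and_run(S, lower, upper, [0.5, 0.25, 0.25], 2000, rng)
```

Every step moves along a kernel direction, so the mass-balance constraint is preserved up to rounding, and the chord limits keep each flux within its bounds.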

We have applied the wBP and KHR algorithms to the study of the metabolic network of the hRBC. As mentioned in

We run the wBP algorithm by representing the marginals, like Equation (10), with sets of variables/weights _{i}^{3} × _{i}(_{µ}(^{5} wBP extractions to achieve a higher accuracy. We report the results in

Results for the human red blood cell. Here, we draw a pictorial representation of the system as a directed bipartite graph. Reaction nodes are plotted with their PDFs and metabolite nodes as green squares. Arrows entering (resp. leaving) a reaction stand for a substrate (resp. a product).

Results for human red blood cell. The probability density functions of the reaction rates; reaction names are the same as [

Concerning the results obtained by KHR, we have been particularly careful to make sure we obtain a uniform distribution of the solution space

In this work, inspired by techniques employed in the statistical mechanics of disordered systems, we have presented a novel method to estimate distributions of reaction fluxes in constraint-based models of metabolic networks. The wBP methodology has, in our view, clear advantages when compared with alternative approaches. If compared to rejection-based Monte Carlo methods [

wBP can also be integrated with optimization-based flux balance analysis (FBA), as it easily allows us to evaluate the PDFs of the enzymatic rates close to optimality (assuming a score function is known) by just injecting

We have also compared the performance of wBP against the KHR method. The latter is a controlled Hit-and-Run Monte Carlo taking place in the null space defined by the set of internal reactions, where a considerable effective dimensional reduction can be achieved. Indeed, starting from the original

The validation of the wBP algorithm in the hRBC network, which can be considered a benchmark for the sampling problem for constrained metabolic models, opens the door to future applications of the method to more relevant organisms, such as

We thank de Martino, D., Güell, O., Guimerà, R., Sales-Pardo, M., Serrano, M.A. and Sagués, F. for useful discussions and comments. F.F.C. acknowledges funding from MINECO (grant FIS2012-31324) and AGAUR (grant 2012FI B00422). F.A.M. acknowledges financial support from European Union grants PIRG-GA-2010-277166 and PIRG-GA-2010-268342. A.D.M. is supported by the DREAM Seed Project of the Italian Institute of Technology (IIT). The IIT Platform Computation is gratefully acknowledged.

The authors declare no conflict of interest.