2.1. Mathematical Statement of the Problem
Suppose that we are interested in estimating the probability distribution functions (PDFs) Pi(x) for each reaction flux i = 1,...,N. By definition, they are given by:

Pi(x) = Vol({x′ ∈ S : x′i = x}) / Vol(S)

where Vol(S) is the volume of set S. Pi(x) can be mathematically written as an integral over all fluxes but xi of a set of functions enforcing the constraints that define the polytope, namely (1). Denoting these functions by Fµ (µ = 1,...,M), we can write Pi(x) for flux i as:

Pi(x) = (1/Zi) ∫_{D\i} dx\i ∏µ Fµ(x\i, x)    (6)
where x\i = (x1,...,xi−1, xi+1,...,xN) denotes the collective integration variable, D\i is the domain of integration (the set-product of the ranges of variability of all fluxes, except flux i) and Zi ≡ ∫dx Pi(x) is a normalisation constant, so that each Pi(x) is properly normalised to one. Each indicator function Fµ should distinguish between metabolites involved only in internal reactions (internal metabolites, for brevity) and metabolites that are exchanged with the surrounding. A convenient parameterisation is given by:

Fµ(x) = ∫ dγ ρµ(γ) δ(∑j ξµj xj − γ)    (7)
where δ(y) is a Dirac δ-distribution and ρµ is an a priori distribution for the exchange rate γµ. Under this choice, for intracellular metabolites, Fµ simply enforces the mass-balance constraint (1) with γµ = 0 in Equation (6) through a δ-function. For exchanged chemical species, the situation is slightly more complex. If the exchange rate takes a fixed value z0, then ρµ(z) = δ(z − z0) and:

Fµ(x) = δ(∑j ξµj xj − z0)    (8)

corresponding, once inserted in Equation (6), to the mass-balance constraint (1) with γµ = z0. If, however, the exchange rate is known only probabilistically, then ρµ(z) can be a non-trivial distribution, and Fµ enforces (1) in (6) by weighing all possible values of γµ according to the measure ρµ. For instance, if there is no a priori information about the exchange rate, then ρµ(z) can be taken to be uniform. Note that when γµ is a random variable, one can consider it as another unknown rate, so that one could also be interested in estimating its a posteriori distribution Pµ(γ). The problem we want to face is that of computing quantities like Equation (6) for all i's.
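To make the object in Equation (6) concrete, here is a small numerical sketch on a toy polytope of our own devising (one metabolite with stoichiometry x1 − x2 − x3 = 0 and fluxes x2, x3 ∈ [0, 1]; none of these numbers come from the hRBC network):

```python
import numpy as np

# Toy illustration of Equation (6) (our own example, not from the text):
# one metabolite with x1 - x2 - x3 = 0 and x2, x3 in [0, 1]. The marginal
# P1(x) is the delta-constrained integral over x2 and x3; here we simply
# estimate it by sampling the box uniformly and histogramming x1.
rng = np.random.default_rng(0)
x2 = rng.uniform(0.0, 1.0, 200_000)
x3 = rng.uniform(0.0, 1.0, 200_000)
x1 = x2 + x3                        # mass balance fixes x1 given x2, x3

hist, edges = np.histogram(x1, bins=40, range=(0.0, 2.0), density=True)
# P1 comes out triangular on [0, 2] with a peak at x = 1: even a uniform
# solution space yields non-trivial single-flux marginals.
```

The peak at x = 1 already shows why flux marginals carry more information than, say, the extreme points of S alone.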
2.2. Weighted Belief Propagation
To push expression (6) forward mathematically, we need to make some type of approximation for S. Following [11], we assume the bipartite graph that describes the interdependency of reactions and metabolites to be locally tree-like. In such a case, we are supposing there are no (or only very long) cycles connecting the reactions that process a given metabolite µ. (In the following, we shall write i ∈ µ to indicate that reaction i processes, either as a substrate or as a product, metabolite µ.) Thus, if we imagine removing metabolite µ from the system, all reactions i ∈ µ become (approximately) statistically independent, as they belong to separate branches of the metabolic (tree-like) network, and their joint PDF factorizes. This is explained pictorially in Figure 1a. If we now put metabolite µ back, we see that, at a fixed value x of the flux of reaction i, the probability Lµ→i(x) that mass balance holds for metabolite µ can be expressed, in terms of the factorized PDF computed in the absence of µ, as (see Figure 1b):

Lµ→i(x) = (1/Lµ→i) ∫_{Dµ\i} dxµ\i Fµ(xµ\i, x) ∏_{l∈µ\i} Pl→µ(xl)    (9)

with Lµ→i a normalisation constant. In this formula, we use Latin labels (i, l,...) for reactions and Greek ones (µ, ν,...) for metabolites, while the script l ∈ µ\i denotes the reactions that process µ, except reaction i. Accordingly, we defined the shorthands dxµ\i = ∏_{l∈µ\i} dxl and Dµ\i = ×_{l∈µ\i} [ml, Ml]. The quantity Pl→µ(xl) is the PDF of flux l taking a value xl when metabolite µ is removed. These PDFs are, in turn, given by the probability, for each reaction i, to satisfy the mass balance conditions for all the metabolites it processes, except µ (see Figure 1c), namely:

Pi→µ(x) = (1/Pi→µ) ∏_{ν∈i\µ} Lν→i(x)    (10)

Here, the set ν ∈ i\µ stands for all metabolites ν processed by reaction i, except µ, and Pi→µ is a normalisation constant. Again, the above equations simply state the fact that, on locally tree-like graphs, the contributions to the PDFs coming from each node (reaction or metabolite) nicely factorize.
Figure 1.
The method used to derive the self-consistency Equations (9) and (10) for the conditional marginals; we only show the nearest neighbours of what one must imagine to be a large tree-like bipartite graph, where circles are reactions and squares are metabolites. (a) Metabolite µ and the reactions that process it (l ∈ µ). If we imagine removing µ from the system, all reactions connected to it belong to disjoint branches of the metabolic network, highlighted with the dashed lines. As a consequence, their joint probability distribution function (PDF) factorizes into the product of the marginals, Pl→µ(x), of each reaction l. (b) When metabolite µ is put back in the graph, the probability, Lµ→i(x), of satisfying its mass balance condition when fixing the flux of reaction i to x depends on the marginals, Pl→µ(x), of all neighbours but i, and on the indicator function, Fµ. (c) The marginal Pi→µ(x), which is computed in the absence of µ, expresses the probability that i satisfies the mass balance conditions for all the metabolites it processes (ν ∈ i), except µ. On a tree, each mass balance condition is independent, so that the probability of satisfying all of them is given by the product of the various Lν→i(x).
The conceptual step of removing metabolites from the system is the key that allows us to recast the problem as the set of self-consistency Equations (9) and (10) for the conditional probabilities (the reader should keep in mind that this is, however, just a mathematical trick with no biological interpretation whatsoever [10]). Once the fixed point of the system formed by Equations (9) and (10) is known, one can compute the actual PDFs of the fluxes in the metabolic network as:

Pi(x) = (1/Pi) ∏_{ν∈i} Lν→i(x)    (11)

where Pi is a normalisation constant. Note that Equation (11) also provides the recipe to evaluate the PDFs Pµ(γ) for the exchange rates γµ, once the conditional marginals Pl→µ(xl) are known.
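The message structure of Equations (9)–(11) can be illustrated with a deliberately simplified, discretized sketch (a toy chain of our own; wBP itself represents marginals by weighted samples rather than a grid precisely to avoid this discretization). We take reactions 1, 2, 3 coupled by metabolites µ1: x1 − x2 = 0 and µ2: x2 − x3 = 0, with x1 ∈ [0, 0.5] and x2, x3 ∈ [0, 1]:

```python
import numpy as np

# Discretized toy version of Equations (9)-(11) on a chain (illustration
# only). Reactions 1, 2, 3; metabolites mu1: x1 - x2 = 0 and
# mu2: x2 - x3 = 0; flux ranges: x1 in [0, 0.5], x2 and x3 in [0, 1].
grid = np.linspace(0.0, 1.0, 201)

def normalise(p):
    return p / p.sum()

# Equation (10) for the leaf reactions 1 and 3: the product over
# nu in i\mu is empty, so the message reduces to the (normalised)
# indicator of the reaction's own flux range.
p_1_mu1 = normalise((grid <= 0.5).astype(float))
p_3_mu2 = normalise(np.ones_like(grid))

# Equation (9): L_{mu->i}(x) integrates the mass-balance delta against
# the incoming marginals; for a constraint x_i = x_l, the delta simply
# copies the incoming message onto the same grid point.
L_mu1_2 = p_1_mu1.copy()
L_mu2_2 = p_3_mu2.copy()

# Equation (11): the marginal of flux 2 is the normalised product of the
# incoming L messages.
p_2 = normalise(L_mu1_2 * L_mu2_2)
```

The marginal of flux 2 comes out uniform on [0, 0.5]: the tighter range of flux 1 reaches flux 2 through the message Lµ1→2.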
As discussed in [9,11], the difficulties of the problem lie not so much in the derivation of Equations (9) and (10), but in devising an efficient method to solve them. In wBP, we tackle the issue by representing the marginals (10) through a collection of variables and associated weights, rather than discretizing them as one would normally do when facing a similar problem. Let us illustrate the idea with a fairly simple example. Consider the integral:

ρx(x) = C ∫₀¹ dy ∫₀¹ dz ρy(y) ρz(z) δ(x − 1 + y + z)    (12)

with the extra condition that x ≥ 0, where ρy, ρz are known densities normalised in the interval [0, 1], and C is a normalisation constant. To evaluate (12), we could use Monte Carlo integration and draw pairs of random variables (yi, zi) according to the distributions ρy and ρz. Correspondingly, an estimate for ρx(x) can be written as:

ρx(x) ∝ ∑i Θ(1 − yi − zi) δ(x − 1 + yi + zi)    (13)

where the term Θ(1 − yi − zi) accounts for draws for which the quantity x = 1 − yi − zi must be rejected due to the condition x ≥ 0. The latter condition indeed defines a feasible triangular region in the (y, z) integration plane (see Figure 2), such that every extraction (yi, zi) falling outside this domain must be rejected. This method (basically, naive Monte Carlo integration) is, hence, poised to be rather inefficient. Fortunately, we know precisely where the rejection region is, and we can rewrite Equation (12) as follows:

ρx(x) = C ∫₀¹ dy ρy(y) ∫₀^{1−y} dz ρz(z) δ(x − 1 + y + z)    (14)
Now, Equation (14) does not contain a rejection region, but we cannot apply Monte Carlo integration just yet, since ρz(z) is not normalised in the interval [0, 1 − y]. Introducing the corresponding weight:

w(y) = ∫₀^{1−y} dz ρz(z)    (15)

we can, however, re-cast Equation (14) in the form:

ρx(x) = C ∫₀¹ dy w(y) ρy(y) ∫₀^{1−y} dz ρ̃z(z|y) δ(x − 1 + y + z)    (16)

with ρ̃z(z|y) ≡ ρz(z)/w(y). The distributions appearing above are now properly normalised. Therefore, to evaluate Equation (16), we can simply draw pairs (yi, zi) according to ρy(y) and ρ̃z(z|y), respectively, and estimate ρx(x) by:

ρx(x) ∝ ∑i w(yi) δ(x − 1 + yi + zi)    (17)

i.e., by pairs of variables and associated weights.
The key point of this method is that the reweighted density ρ̃z(z|y) has a y-dependent support, such that rejection never occurs. Thus, at the price of computing a weight w(y), we overcome the whole rejection issue, and the method becomes much more efficient.
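Both estimators of the toy example can be sketched in a few lines. We take ρy = ρz = Uniform(0, 1) for concreteness (an assumption of this sketch, under which ρx(x) = 2(1 − x), w(y) = 1 − y and ρ̃z(z|y) is Uniform(0, 1 − y)):

```python
import numpy as np

# The two estimators of the toy integral, with rho_y = rho_z = U(0, 1).
rng = np.random.default_rng(1)
n_pairs = 100_000

# --- Naive Monte Carlo, Equation (13): draw (y_i, z_i) and reject the
# draws falling outside the triangle y + z <= 1 (the Theta factor).
y = rng.uniform(0.0, 1.0, n_pairs)
z = rng.uniform(0.0, 1.0, n_pairs)
keep = y + z <= 1.0
x_naive = 1.0 - y[keep] - z[keep]
acceptance = keep.mean()                 # about 1/2 of the work is wasted

# --- Weighted sampling, Equations (15)-(17): for a uniform rho_z the
# weight is w(y) = 1 - y and rho_z(z|y) is Uniform(0, 1 - y), so every
# draw is feasible by construction and carries a weight w(y_i).
y2 = rng.uniform(0.0, 1.0, n_pairs)
w = 1.0 - y2                             # Equation (15) for uniform rho_z
z2 = rng.uniform(0.0, w)                 # z_i ~ reweighted rho_z(z | y_i)
x_weighted = 1.0 - y2 - z2               # never negative: no rejection

# Both estimates target rho_x(x) = 2(1 - x) on [0, 1], whose mean is 1/3.
mean_naive = float(x_naive.mean())
mean_weighted = float(np.average(x_weighted, weights=w))
```

With uniform densities, the naive route discards about half of the draws, while the weighted route uses every pair at the cost of one weight evaluation per draw.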
The great advantage of using wBP is that, at a fixed number of variable pairs, its running time goes as O(2Nk), where k is the average number of metabolites processed by each reaction. Thus, as opposed to sampling techniques, which normally have super-linear mixing times [14], wBP scales only linearly with the number of reactions (see Figure 3), making it an ideal candidate for application to genome-scale metabolic networks. In the present work, we focus, however, on the relatively small case of the hRBC, so that we are able to compare with sampling methods that yield a uniform exploration of the solution space S [8]. Due to the nature of such methods (see next section), this type of comparison is still not feasible for larger systems. This, and the fact that previous results are available [9], make the metabolic network of the hRBC the ideal testing ground for wBP.
Figure 2.
Avoiding rejection. As explained in the text, of the original integration region, (y, z) ∈ [0, 1] × [0, 1], only the part below the line z = 1 − y contributes to ρx(x). However, in this lower triangle, the density ρy(y)ρz(z) is no longer normalised. This is easily dealt with by reweighting the integral.
Figure 3.
The running time t of the weighted Belief Propagation (wBP) algorithm vs. the number of reactions, N. For each value of N, we average here over 10 random synthetic metabolic networks, each having M = N/2 metabolites. The algorithm (blue circles) scales linearly with the system size; a linear function, t ∝ N (green dashed line), is plotted to guide the eye.
2.3. The Kernel Hit-and-Run (KHR) Algorithm
In order to sample the solution space S and obtain exact PDFs of individual fluxes for the hRBC by a controlled method that guarantees uniformity, we have developed an optimized version of Hit-and-Run Monte Carlo, which we call the Kernel Hit-and-Run method. Let us start by re-writing constraints (2) explicitly for metabolites involved in internal reactions and the rest:
We note that the set of equations in (18), one per internal metabolite, defines the null space of ξ and geometrically corresponds to a family of hyperplanes passing through the origin x = 0. Let us denote the dimension of the null space of ξ as K. Clearly, K is at least N minus the number of internal metabolites, with equality when ξ has full row rank, which can always be made to be the case and which we assume from now on. This means, obviously, that, although the number of variables in the system is N, due to the constraints in the model, the actual dimension of the solution space S is only K. As in real metabolic networks most metabolites are involved only in internal reactions, the dimension K of the null space will be significantly smaller than the original dimension of the problem N. Additionally, it turns out that the way to implement such a dimensional reduction in practice is quite straightforward: suppose that a basis of the null space has been found, e.g., through Gaussian elimination or singular value decomposition (SVD), and let us denote by y = (y1,...,yK) the system of coordinates with respect to such a basis, so that we can write each flux vector in this basis as x = Φy, with Φ an N × K matrix related to the change of basis between the original space and the null subspace. Plugging this into Equations (19) and (20) allows us to write:
where we have defined the projected stoichiometric matrix Ψ, with entries Ψµj = ∑i ξµi Φij. The set of Equations (21) defines a K-dimensional polytope in the null space (see Figure 4), which can be sampled uniformly by using the Hit-and-Run algorithm [8,15,16]. Finally, to go back to the original space, that of the reaction rates, we simply use the fact that x = Φy. The sampling properties of the Hit-and-Run algorithm under the uniform measure were indeed mathematically proven [8], and in our case, it is very easy to see that the uniform measure in the K-dimensional null space is preserved under a linear transformation, so that the final sample in the full-dimensional space is also uniform by construction.
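The null-space construction can be sketched with standard numerical tools; the stoichiometric matrix below is a random stand-in, not the hRBC one:

```python
import numpy as np

# Sketch of the null-space reduction: Phi collects the right-singular
# vectors of xi associated with (numerically) zero singular values.
rng = np.random.default_rng(3)
M_int, N = 10, 20                       # internal metabolites, reactions
xi = rng.normal(size=(M_int, N))        # full row rank with probability 1

U, s, Vt = np.linalg.svd(xi)
rank = int(np.sum(s > 1e-10 * s.max()))
K = N - rank                            # dimension of the null space
Phi = Vt[rank:].T                       # N x K, orthonormal: xi @ Phi = 0

# Any y in R^K then yields a flux vector x = Phi @ y satisfying all the
# internal mass balances exactly, with no matrix inversion involved.
y = rng.normal(size=K)
x = Phi @ y
residual = float(np.abs(xi @ x).max())
```

An orthonormal Φ obtained from the SVD is convenient because mapping back and forth between y and x is a plain matrix product.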
While the sampling measure of KHR is well controlled, a word needs to be spent on the algorithmic mixing time. For the standard Hit-and-Run algorithm, this scales as the square of the dimension times the diameter of the polytope, i.e., in practice, cubically with the number of dimensions [8,14]. Yet, as mentioned before, in our approach the dimension of the polytope is K, rather than N. This can yield a significant reduction in computation times if K is small compared to N, as will quite generally be the case. For the hRBC, for instance, we pass from N = 46 (which can be problematic, e.g., for Monte Carlo rejection [17]) to a much more modest K = 12, which is sampled quite fast by KHR. Note also that no additional constraints need to be introduced to enclose the polytope (as opposed to [13]). This is due to the fact that the constraints associated with the metabolites exchanged with the environment suffice to bound the polytope in the null space.
Figure 4.
A cartoon representation of the polytope in the K-dimensional null space spanned by the y-coordinates. The green dashed lines represent the set of hyperplanes (21) enclosing the polytope.
Finally, it is worth mentioning that the matrix Φ can be easily obtained with any standard linear algebra software or standard SVD routines, and that no matrix inversion is required either to compute the projected matrix Ψ or to convert the obtained sample back to the original, full-dimensional space. Therefore, to sum up, KHR uniformly samples the solution space S, with a mixing time that scales as O(K³).
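A minimal sketch of the Hit-and-Run step in the null-space coordinates (toy dimensions and made-up unit flux bounds of our own; a real run would use the Φ and bounds of the network at hand): pick a random direction, intersect the line with the flux bounds to obtain a feasible chord, and jump to a uniform point on it.

```python
import numpy as np

# Hit-and-Run in null-space coordinates: the polytope is
# {y : lo <= Phi @ y <= hi}, i.e., the flux bounds seen through x = Phi y.
rng = np.random.default_rng(4)
N, K = 6, 3
Phi, _ = np.linalg.qr(rng.normal(size=(N, K)))   # stand-in N x K basis
lo, hi = -np.ones(N), np.ones(N)                 # toy flux ranges [m_i, M_i]

y = np.zeros(K)                     # a strictly feasible starting point
samples = []
for _ in range(5000):
    d = rng.normal(size=K)
    d /= np.linalg.norm(d)          # uniform random direction
    a, b = Phi @ d, Phi @ y
    # Feasible chord: lo <= b + t * a <= hi for every flux bound.
    with np.errstate(divide="ignore", invalid="ignore"):
        t1 = np.where(a != 0.0, (lo - b) / a, -np.inf)
        t2 = np.where(a != 0.0, (hi - b) / a, np.inf)
    t_min = np.minimum(t1, t2).max()
    t_max = np.maximum(t1, t2).min()
    y = y + rng.uniform(t_min, t_max) * d        # uniform point on the chord
    samples.append(Phi @ y)                      # map back to flux space

samples = np.asarray(samples)
inside = bool(np.all((samples >= lo - 1e-9) & (samples <= hi + 1e-9)))
```

Every iteration costs two matrix-vector products with Φ, and every sample automatically satisfies the internal mass balances, since it lives in the null space by construction.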