These authors contributed equally to this work.

This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

Thermodynamics constrains the flow of matter in a reaction network to occur through routes along which the Gibbs energy decreases, implying that viable steady-state flux patterns should be void of closed reaction cycles. Identifying and removing cycles in large reaction networks can unfortunately be a highly challenging task from a computational viewpoint. We propose here a method that accomplishes it by combining a relaxation algorithm and a Monte Carlo procedure to detect loops, with

Since Lavoisier's discovery of the relation between respiration and combustion, thermodynamics has stood as a key physical framework for understanding metabolism and physiology, from single cells to whole organisms. When applied to a given metabolic reaction network, at the simplest level, thermodynamics requires that, in non-equilibrium steady states, fluxes of matter proceed downhill in the underlying Gibbs (free) energy landscape. Violations of this rule (which is nothing but the second law of thermodynamics) are signaled by the existence of unphysical cycles in flux configurations [

The reference modeling scheme that we shall consider here is given by the so-called constraint-based models [_{mr}_{r}_{m}_{r}

Solutions of

Checking the thermodynamic feasibility of a flux pattern can be made straightforward. Denoting by _{mr}_{m}_{m}_{r}_{r}
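As a minimal illustration of such a check, the sketch below tests whether a given vector of chemical potentials makes every active reaction run downhill in Gibbs energy. The names `S` (stoichiometric matrix), `v` (fluxes), `mu` (chemical potentials), the helper functions and the two-metabolite example are assumptions introduced here for illustration; they are not taken from the authors' code.

```python
def reaction_free_energies(S, mu):
    """Return dG[r] = sum_m mu[m] * S[m][r] for every reaction r."""
    n_reactions = len(S[0])
    return [sum(mu[m] * S[m][r] for m in range(len(S))) for r in range(n_reactions)]


def is_downhill(S, v, mu, tol=1e-12):
    """True if every active flux v[r] points towards decreasing Gibbs energy."""
    dG = reaction_free_energies(S, mu)
    return all(v[r] * dG[r] < 0 for r in range(len(v)) if abs(v[r]) > tol)


# Hypothetical toy: reaction 0 converts A -> B, reaction 1 converts B -> A.
S = [[-1, 1],    # metabolite A
     [1, -1]]    # metabolite B
mu = [1.0, 0.0]  # A sits at a higher chemical potential than B

only_forward = is_downhill(S, [2.0, 0.0], mu)  # only A -> B is active
closed_cycle = is_downhill(S, [3.0, 1.0], mu)  # both reactions active
print(only_forward, closed_cycle)
```

In this toy case, no choice of `mu` can make the second flux pattern pass the test, because the two active reactions form a closed cycle.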

Finding all cycles in a directed (bipartite) network is, at heart, an NP-hard integer programming problem (NP: non-deterministic polynomial time) [

The strategy we present here combines a relaxation algorithm and a Monte Carlo method to allow for the thorough analysis of thermodynamic infeasibilities on genome-scale metabolic networks of unprecedented size. More precisely, loops will be found by applying Monte Carlo to

The structure and rationale of the method we propose are discussed in detail in Section 2, together with a brief summary of the network reconstructions we shall employ. Section 3 presents our results, while our conclusions are reported in Section 4.

The human Reactome Recon-2 [

In addition to the global reconstruction, [

We have also analyzed the reconstructed metabolic network of the bacterium

In each case, the key information we employed is encoded in the stoichiometric matrix

Before describing the algorithm in detail, we briefly recall the idea behind the procedure. We do not directly assess whether the flux configuration is loop free, but we try to compute the chemical potentials that satisfy
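The link between searching for chemical potentials and searching for loops can be stated compactly through Gordan's theorem of the alternative. The notation in the following sketch is an assumption made for illustration: $\xi_{mr} = \mathrm{sign}(v_r)\,S_{mr}$ denotes the stoichiometric matrix restricted to the active reactions and oriented along the fluxes $v_r$, $\mu_m$ are the chemical potentials, and $k_r$ are the coefficients of a candidate loop.

```latex
\text{Exactly one of the following two systems has a solution:}
\begin{align}
\text{(i)}\quad  & \sum_{m} \mu_m\, \xi_{mr} < 0 \quad \forall r
   && \text{(feasible chemical potentials exist)},\\
\text{(ii)}\quad & \sum_{r} \xi_{mr}\, k_r = 0 \quad \forall m,
   \qquad k_r \ge 0 \ \ \forall r, \quad k \neq 0
   && \text{(a closed cycle exists)}.
\end{align}
```

Under this reading, a failure to find potentials solving (i) points to the existence of at least one loop solving (ii).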

The overall structure of the algorithm is reported in

In a few words, and referring to points A, B, C.1, C.2 and D shown explicitly in the flow chart:

(A) Input: the input information includes a stoichiometric matrix,

(B) Compute the matrix,

(C.1) Update the vector,

(C.2) Perform a Monte Carlo computation, as described in Section 2.2.3, in order to find a solution of system

(D) Output: a thermodynamically feasible flux vector.

Flowchart of the algorithm for counting and removing cycles employed in this study. See text for details.

In the following sections, we shall describe the sub-procedures (relaxation method, Monte Carlo and cycle removal) of the algorithm in detail. A C++ code, which performs each of the above steps, is provided as

This routine, corresponding to point C.1 of the flow chart, allows one to retrieve a solution of _{t}_{t}
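To make the idea of a relaxation step concrete, here is a minimal perceptron-style sketch that looks for a vector `mu` satisfying a set of strict linear inequalities. It is a stand-in chosen for illustration, not a reproduction of the authors' update rule: the function name `relax`, the step size and the stopping criterion are all assumptions. Each row of `rows` is assumed to hold the flux-oriented stoichiometry of one active reaction, and the routine records which constraints it had to correct.

```python
def relax(rows, max_steps=10_000, step=1.0):
    """Look for mu such that dot(row, mu) < 0 for every row.

    Returns (mu, visited). mu is None if no solution was found within
    max_steps; visited lists the indices of the constraints that were
    corrected along the way.
    """
    mu = [0.0] * len(rows[0])
    visited = []
    for _ in range(max_steps):
        # Identify the least satisfied constraint (largest row . mu).
        scores = [sum(a * b for a, b in zip(row, mu)) for row in rows]
        worst = max(range(len(rows)), key=lambda r: scores[r])
        if scores[worst] < 0:
            return mu, visited
        # Move mu against the violated row.
        mu = [m - step * a for m, a in zip(mu, rows[worst])]
        visited.append(worst)
    return None, visited


feasible = [[-1, 1]]             # hypothetical: a single reaction A -> B
infeasible = [[-1, 1], [1, -1]]  # A -> B and B -> A both active

mu, visited = relax(feasible)
mu_bad, visited_bad = relax(infeasible, max_steps=100)
print(mu, visited)
print(mu_bad, sorted(set(visited_bad)))
```

When the inequalities cannot be satisfied, the routine keeps cycling over the same few constraints; the distinct indices collected in `visited_bad` are natural candidates for the reactions taking part in a loop.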

As said above, cycles generically correspond to solutions of _{r}_{r}
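The following sketch shows one way a Monte Carlo search for such a cycle could look on a very small example. It is a simplified Metropolis scheme written for illustration under several assumptions that do not come from the source: the energy function (squared norm of the mass imbalance), the single-reaction moves, the cap `k_max`, the temperature and the example matrix `xi` are all hypothetical choices.

```python
import math
import random


def energy(xi, k):
    """Squared norm of xi @ k; it vanishes exactly when k is a closed cycle."""
    return sum(sum(row[r] * k[r] for r in range(len(k))) ** 2 for row in xi)


def find_cycle(xi, k_max=3, temperature=1.0, max_steps=100_000, seed=0):
    """Metropolis search for a non-zero integer vector k >= 0 with xi @ k = 0."""
    rng = random.Random(seed)
    n = len(xi[0])
    k = [0] * n
    k[rng.randrange(n)] = 1  # start from a single active reaction
    e = energy(xi, k)
    for _ in range(max_steps):
        if e == 0:
            return k
        r = rng.randrange(n)
        trial = list(k)
        trial[r] += rng.choice((-1, 1))
        if not 0 <= trial[r] <= k_max or not any(trial):
            continue  # stay inside the allowed, non-zero region
        e_trial = energy(xi, trial)
        if e_trial <= e or rng.random() < math.exp((e - e_trial) / temperature):
            k, e = trial, e_trial
    return k if e == 0 else None


# Hypothetical flux-oriented matrix: rows are metabolites A, B, C;
# columns are reactions A -> B, B -> C, C -> A and A -> C.
xi = [[-1, 0, 1, -1],
      [1, -1, 0, 0],
      [0, 1, -1, 1]]

cycle = find_cycle(xi)
print(cycle, energy(xi, cycle))
```

Restricting the search to the reactions flagged by the relaxation step, as discussed in the text, would shrink the number of columns of `xi` and hence the space the sampler has to explore.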

It is worth noting that, based on the above discussion of the relaxation method, the number of reactions to be included in the above procedure equals the number of distinct reactions appearing in the list, which is usually much smaller than

Once a flux cycle has been identified, there are multiple ways to remove it and re-organize the flux pattern, while still preserving all constraints and, eventually, the values of objective functions.

To clarify the situation, consider the following simple example with four reactions, pictured in

Example of a toy reaction network. The black dots are two metabolites, to each of which corresponds a mass balance constraint. Each line is labeled with the name of the flux carried by a reaction, and the arrow indicates the conventional forward direction of the fluxes. Evidently, a thermodynamically infeasible cycle is present if _{1} and _{2} have the same sign.

_{1} and _{2} are “internal” fluxes, while _{3} and _{4} are an intake and an outtake flux, respectively. The stoichiometric matrix _{3} = _{4} = 2, and add a lower bound on the first flux, _{2} ≥
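The toy network can be written down explicitly as follows. The orientation of the reactions (reaction 1 converting the first metabolite into the second, reaction 2 doing the opposite), the metabolite labels `A` and `B` and the sample flux values are assumptions made for this sketch, chosen to agree with the figure caption: a cycle is present when the two internal fluxes have the same sign.

```python
# Rows: metabolites A and B.
# Columns: v1 (A -> B), v2 (B -> A), v3 (intake of A), v4 (outtake of B).
S = [[-1, 1, 1, 0],
     [1, -1, 0, -1]]


def mass_balance(S, v):
    """Net production rate of each metabolite for the flux vector v."""
    return [sum(s * x for s, x in zip(row, v)) for row in S]


def has_internal_cycle(v, tol=1e-12):
    """In this toy network a cycle exists when v1 and v2 share the same sign."""
    return v[0] * v[1] > tol


with_loop = [3.0, 1.0, 2.0, 2.0]     # steady state with v3 = v4 = 2, contains a cycle
without_loop = [2.0, 0.0, 2.0, 2.0]  # same exchanges, no cycle

print(mass_balance(S, with_loop), has_internal_cycle(with_loop))
print(mass_balance(S, without_loop), has_internal_cycle(without_loop))
```

Both flux vectors satisfy the mass balance constraints with the same intake and outtake, which is why stoichiometry alone cannot tell them apart.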

After the elimination of the uptakes, the internal stoichiometric matrix and flux vector (which, for clarity, in this section, we denote as ^{int}^{int}_{3} and _{4}, we can move them to the right hand side of the mass balance constraints, obtaining:
_{2} ≥

Given a solution ^{a}^{int}^{a}^{a}^{int}^{a}^{int}^{a}

^{a}_{2} ≥

With this notation, the space of vectors _{1} ∩
_{2}, where:

Suppose now that, in our example, we are given the flux vector ^{*}^{int}^{*}^{int}_{2} ≥ _{1} and _{2}, if both active, operate in the same direction. The constraint _{2} ≥ _{1} and
_{2}, are then given by:
_{2} ≥ 0 is an irreversibility constraint. In this case, the intersection of the two sets above imposes

From this last observation, we can deduce the following general loop removal strategy (which we refer to as the “local” correction strategy): for a given loop ^{a}^{a}_{r}_{r}
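One natural reading of a local correction is sketched below: subtract from the flux vector the largest multiple of the loop vector that switches off one reaction of the loop without reversing any other. The function name `remove_loop_locally`, the requirement that the loop be oriented along the fluxes, and the two-reaction example are assumptions for illustration; bounds and objective values are not checked here.

```python
def remove_loop_locally(v, loop):
    """Return v - alpha * loop, with alpha chosen so that one loop flux vanishes.

    Assumes every reaction with loop[r] > 0 carries a flux v[r] > 0,
    i.e., the loop is oriented along the fluxes.
    """
    alpha = min(v[r] / loop[r] for r in range(len(v)) if loop[r] > 0)
    return [x - alpha * k for x, k in zip(v, loop)]


def net_production(S, v):
    """Net production of each metabolite by the internal reactions."""
    return [sum(s * x for s, x in zip(row, v)) for row in S]


# Hypothetical internal network: reaction 1 is A -> B, reaction 2 is B -> A.
S_int = [[-1, 1],
         [1, -1]]
v_int = [3.0, 1.0]  # both reactions active in the same sense: a cycle
loop = [1, 1]

corrected = remove_loop_locally(v_int, loop)
print(corrected)
print(net_production(S_int, v_int), net_production(S_int, corrected))
```

Because the loop vector lies in the null space of the internal stoichiometric matrix, the net production of each metabolite is unchanged, so the exchange fluxes can stay fixed.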

Other possible loop-removing procedures are based on the minimization of some norm of the fluxes, as suggested in [_{p}_{r}_{r}^{p}_{1} is the so-called “Taxicab” norm, _{2} is the square of the Euclidean norm, ^{*}_{p}^{a}^{*}_{a} L^{a}^{a}_{p}^{a}

If at least one of the fluxes
^{a}

If all fluxes are non-zero, the vector ^{a}

Therefore, the vector ^{*}_{p}

The argument can be easily extended to include irreversibility constraints. Let ^{*}_{p}_{r}_{1}, _{2}, …}. We shall instead denote by
_{r}^{*}^{*}_{p}_{r}_{0} and _{r}_{0}. Given this, one can now proceed along the same lines as before, because, for any vector ^{a}^{int}

If some reaction, _{0}, then ^{a}

Otherwise, we can demonstrate that ^{a}_{p}^{*}^{a}^{a}

Problems may arise, as before, when boundary conditions like _{r}

If _{2} > _{p}^{*}

If − 1 ≤ _{2} = ^{*}

If _{2} =

In summary, the global minimization of the norm _{p}
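The behaviour of the norm minimization can be followed by hand on the toy network. The worked example below assumes a specific orientation (reaction 1 runs from the first metabolite to the second, reaction 2 in the opposite sense) and intake and outtake fluxes equal to 2, so that mass balance reads $v_1 - v_2 = 2$; the parameter $t$ and the bound $\ell$ are symbols introduced here for illustration.

```latex
\begin{align}
 & v_2 = t, \qquad v_1 = 2 + t, \\
 \|v\|_1 &= |2 + t| + |t| =
   \begin{cases}
     2        & -2 \le t \le 0, \\
     2 + 2t   & t > 0, \\
     -2 - 2t  & t < -2,
   \end{cases} \\
 \|v\|_2^2 &= (2 + t)^2 + t^2 = 2\,(t + 1)^2 + 2 ,
   \qquad \text{minimal at } t = -1 .
\end{align}
% Every unconstrained minimizer has v_1 >= 0 >= v_2, so no cycle is present.
% With a bound v_2 >= \ell > 0, both norms are minimized at t = \ell,
% giving v = (2 + \ell, \ell): the two fluxes share the same sign and the
% cycle cannot be removed without violating the bound.
```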

As a proof of principle, we have applied our method to search for and enumerate all independent infeasible loops of a large metabolic network reconstruction for the bacterium

In

The number of independent loops identified in the metabolic network of

We identify 196 loops (189 of which involve three or more reactions) after generating about 80,000 random configurations; no new loops appear upon enlarging the test ensemble. The loops thus found are listed in Supporting File 1, and a histogram of the cycle lengths (in terms of the number of reactions involved) is displayed in

Histogram of the length (number of reactions involved) of the 196 independent cycles detected in

We note that, in [

We now move on to the identification of thermodynamic infeasibilities in the human Reactome Recon-2 [

Like almost all metabolic objective functions, the biomass reaction of Recon-2 contains ATP hydrolysis, representing the energetic requirements associated with cell duplication, which are not explicitly accounted for by the flux organization. Such requirements are typically large: the stoichiometry of ATP in the biomass reaction is often two orders of magnitude larger than that of the other chemical species. Hence, ATP tends to be the limiting factor for biomass production, and FBA solutions will often organize metabolic fluxes so as to produce as much ATP as possible. In Recon-2, however, this turns out to lead to a violation of thermodynamics. In particular, in the FBA solution for Recon-2, we detect a huge number of cycles involving the active and passive transport of a metabolite through a membrane, as, e.g., for the transport of stearoyl-CoA (stcoa) from cytosol (c) to peroxisomes (x), namely (see

Typical structure of an infeasible loop created by two reversible transports across a membrane-enclosed compartment: a passive one (by diffusion) and an active one (requiring the expenditure of energy). If the active process is allowed to reverse and the cargo re-enters the cell or compartment

ATP-coupled reactions are a common, though not the only, source of thermodynamic inconsistencies that can be spotted in Recon-2 (see Supporting File 2 for the complete list of cycles we identified in the Recon-2-derived cell-type specific networks). It is, however, important to stress that they are spurious and may be identified easily by complementing Recon-2 with a maintenance reaction that mimics the energy expenditure associated with basal processes (similar to those that are present in bacterial metabolic networks), and even cured automatically by fixing the directionality of active transports directly in the reconstruction (when possible).

In this section, we focus on finding and correcting infeasible loops in FBA solutions of the cell-type specific human metabolic networks derived from Recon-2. We have restricted our attention to 15 networks carrying an objective function, representing, respectively, cerebral cortex neuronal cell, liver bile duct cell, cervix uterine squamous epithelial cell, kidney tubule cell, gall bladder cell, lung macrophage, small intestine glandular cell, rectum glandular cell, smooth muscle cell, urinary bladder urothelial cell, pre- and post-menopause uterus glandular cell, pancreatic exocrine glandular cell, tonsil germinal cell and squamous epithelial cell. We first computed the FBA solutions for each of the networks _{1} norm, while fixing sinks, uptakes and objective function to the values of the FBA solution). In particular, with the local strategy, we eliminate one infeasible loop at a time, making sure that no constraint is violated by the corrected solutions, including the value of the objective function. We note, however, that the local strategy does not return a unique thermodynamically consistent network, since the final flux pattern may depend on the order in which loops are removed. We shall see that, quite generically, this strategy produces flux patterns that are more similar to the original (infeasible) solutions than those generated by the global correction strategy.

Results are shown in

Clearly, _{ab}^{a}^{b}_{ab}_{ab}^{a}^{b}_{ab}_{r}_{r}_{0}, where _{0} is a (small) threshold. Results have been obtained with _{0} = 10^{−6}, but they are robust to changes in this value. Values of the overlaps between the three solutions we consider (original FBA, FBA corrected by the local strategy, FBA corrected by the global strategy) are also displayed in

The final column of _{1} minimization provide (thermodynamically feasible) flux configurations carrying a negative Gibbs energy difference for ATP hydrolysis. A possible and easily implemented improvement of the method we present would be to take physiological aspects into account when correcting a flux configuration. We stress once more, however, that these types of infeasibilities are due to inconsistent constraints or wrong reversibility assignments that prevent the existence of feasible, energetically realistic flux patterns, and that they can be eliminated already at the stage of network reconstruction. Our main goal here was to show that our method is capable of identifying and correcting loops. This type of example shows that it can furthermore point to possible limitations of the current models.

Accounting for thermodynamic constraints in stoichiometry-based flux models, though potentially highly rewarding (in terms of the possibility to predict metabolite levels, chemical potentials, reaction free energies and reversibility), is a generically hard task. Methods that integrate directly with the constraints defining the space of viable fluxes are often computationally intensive and either presuppose prior biochemical knowledge or lead to a considerable increase in the number of parameters (or both). The technique presented here makes use of stoichiometry alone (hence, it is essentially a topological method) and allows us to accomplish two goals: on the one hand, counting and listing the infeasible reaction cycles that affect flux configurations derived from thermodynamics-free models; on the other, correcting such infeasibilities in a physically motivated manner. Indeed, we have first analyzed the genome-scale metabolic network reconstruction iAF1260 of the bacterium

The work presented here extends and improves on previous studies and takes several steps to suggest controlled and motivated methods to deal with thermodynamic inconsistencies in large networks of biochemical reactions. Further improvements along the lines discussed above (requiring, e.g., more precise physiological constraints) are clearly possible. Most promisingly, however, we believe that work directed at enhancing the integration of thermodynamic constraints into flux analysis would be extremely important in light of the current efforts aimed at increasing the scope, reach and predictive power of computational models of cellular metabolism. In the absence of sufficiently detailed biochemical information about metabolite levels

Overview of the results obtained for the human tissue-specific metabolic networks (with the biomass objective function). Columns are as follows. Cell type: abbreviations for the tissue-specific metabolic networks examined, for the full names please refer to the text (head of Section 3.2). _{FBA}_{FBA}_{local}_{local}_{global}_{global}_{FBA,local}_{FBA,global}_{local,global}

Cell type | | | | | | | | | | Overlap (FBA, local) | Overlap (FBA, global) | Overlap (local, global) | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Bile duct | 2,076 | 1,445 | 1,009 | 743 | 215 | 516 | 554 | 367 | 476 | 0.706 | 0.559 | 0.781 | + |
Cer. cortex | 2,169 | 1,494 | 1,231 | 898 | 358 | 818 | 767 | 257 | 320 | 0.750 | 0.448 | 0.629 | + |
Cerv. ut. | 1,774 | 1,171 | 1,046 | 780 | 194 | 562 | 620 | 339 | 380 | 0.666 | 0.480 | 0.735 | - |
Gall blad. | 3,073 | 2,159 | 1,666 | 1,284 | 385 | 1,514 | 1,227 | 254 | 356 | 0.751 | 0.471 | 0.521 | + |
Kidney | 3,176 | 2,212 | 1,695 | 1,285 | 414 | 1,423 | 1,196 | 142 | 449 | 0.759 | 0.469 | 0.551 | + |
Lung macroph. | 2,810 | 1,991 | 1,313 | 960 | 223 | 817 | 779 | 606 | 587 | 0.765 | 0.681 | 0.849 | - |
Pancreas | 2,821 | 1,951 | 1,319 | 948 | 409 | 814 | 797 | 225 | 534 | 0.756 | 0.534 | 0.701 | + |
Rectum | 2,976 | 2,041 | 1,328 | 1,135 | 406 | 989 | 1,017 | 259 | 399 | 0.765 | 0.560 | 0.670 | - |
Small int. | 3,179 | 2,213 | 1,385 | 1,192 | 405 | 836 | 1,023 | 185 | 206 | 0.776 | 0.578 | 0.745 | + |
Smooth muscle | 1,806 | 1,222 | 1,042 | 796 | 184 | 579 | 607 | 314 | 320 | 0.677 | 0.501 | 0.747 | + |
Tonsil ger. | 2,126 | 1,421 | 1,178 | 884 | 405 | 881 | 764 | 357 | 412 | 0.667 | 0.503 | 0.644 | - |
Tonsil squam. | 2,573 | 1,718 | 1,719 | 1,250 | 423 | 1,455 | 1,188 | 301 | 403 | 0.718 | 0.334 | 0.430 | + |
Urot. blad. | 2,874 | 1,965 | 1,597 | 1,308 | 219 | 1,111 | 1,158 | 148 | 686 | 0.760 | 0.450 | 0.613 | + |
Uterus post-m. | 2,773 | 1,973 | 1,266 | 1,095 | 305 | 736 | 927 | 303 | 389 | 0.763 | 0.578 | 0.757 | + |
Uterus pre-m. | 2,793 | 1,982 | 1,376 | 1,157 | 208 | 924 | 1,022 | 259 | 582 | 0.785 | 0.507 | 0.658 | + |

This work is supported by the DREAM Seed Project of the Italian Institute of Technology (IIT). The IIT Platform Computation is gratefully acknowledged.

The authors declare no conflict of interest.
