Modelling Cell Metabolism: A Review on Constraint-Based Steady-State and Kinetic Approaches

: Studying cell metabolism serves a plethora of objectives such as the enhancement of bioprocess performance, and advancement in the understanding of cell biology, of drug target discovery, and in metabolic therapy. Remarkable successes in these ﬁelds emerged from heuristics approaches, for instance, with the introduction of effective strategies for genetic modiﬁcations, drug developments and optimization of bioprocess management. However, heuristics approaches have showed signiﬁcant shortcomings, such as to describe regulation of metabolic pathways and to extrapolate experimental conditions. In the speciﬁc case of bioprocess management, such shortcomings limit their capacity to increase product quality, while maintaining desirable productivity and reproducibility levels. For instance, since heuristics approaches are not capable of prediction of the cellular functions under varying experimental conditions, they may lead to sub-optimal processes. Also, such approaches used for bioprocess control often fail in regulating a process under unexpected variations of external conditions. Therefore, methodologies inspired by the systematic mathematical formulation of cell metabolism have been used to address such drawbacks and achieve robust reproducible results. Mathematical modelling approaches are effective for both the characterization of the cell physiology, and the estimation of metabolic pathways utilization, thus allowing to characterize a cell population metabolic behavior. In this article, we present a review on methodology used and promising mathematical modelling approaches, focusing primarily to investigate metabolic events and regulation. Proceeding from a topological representation of the metabolic networks, we ﬁrst present the metabolic modelling approaches that investigate cell metabolism at steady state, complying to the constraints imposed by mass conservation law and thermodynamics of reactions re-versibility. Constraint-based models (CBMs) are reviewed highlighting the set of assumed optimality functions for reaction pathways. We explore models simulating cell growth dynamics, by expanding ﬂux balance models developed at steady state. Then, discussing a change of metabolic modelling paradigm, we describe dynamic kinetic models that are based on the mathematical representation of the mechanistic description of nonlinear enzyme activities. In such approaches metabolic pathway regulations are considered explicitly as a function of the activity of other components of metabolic networks and possibly far from the metabolic steady state. We have also assessed the signiﬁcance of metabolic model parameterization in kinetic models, summarizing a standard parameter estimation procedure frequently employed in kinetic metabolic modelling literature. Finally, some optimization practices used for the parameter estimation are reviewed.


Introduction
Expanding the modelling paradigm, from a microscopic description to systems biology perspective, has offered a deeper capacity to describe cell behaviour and its interaction with extracellular environment. The improvement was necessary to address complex problems related to the characterization of intracellular events [1][2][3]. Indeed, one-reaction models have then rapidly evolved to reaction network-level models, and recently reached a genome-wide level when applicable. Various mathematical approaches have been proposed to first integrate a reaction network and stoichiometry, for then integrating reaction kinetics and regulation mechanisms. While evolving, models have gained the capacity to question cell behaviour by both interpolating within and extrapolating from experimental data, for then allow extracting data that are tedious to impossible to be directly acquired, such as intracellular metabolic flux rates [4]. The predictive power that network-level models offer is an important advantage of such systemic mathematical representation compared to heuristics approaches. In fact, metabolic models have evolved to encompass higher levels of the knowledge regarding a biosystem, i.e., from gene to protein to reaction, to thus enhance the predictive power. Cell metabolism is a complex biosystem of study, even for unicellular organisms. Indeed, metabolic modelling is particularly challenging because many cellular functions are the result of a combination of factors. Therefore, unravelling causal relations and correlations between an observed phenomenon and its associated factors necessitates a model capable of describing the underlying biosystem, the network of biochemical interactions, and their connectivity and regulation. A model is still a modest approximation [5]. The merits and drawbacks of a model are determined by adequacy to describe a biological reality, further, a compromise between model correctness and the extent to which it is mathematically tractable is unavoidable [6]. Thus, simplifying assumptions must be considered making a metabolic model definite in its scope and solvable regarding available experimental data. These assumptions are a consolidation of heuristics experiences for a model, the current knowledge of the biosystem and of its interaction with its surroundings. Typical assumptions concern the cell compartments homogeneity level, a diversity in biological time scales, the level of distributing versus lumping entities of interest, among others. These assumptions are particularly of interest because they are the main factors determining the relation between an observation in reality and the result of the simulation of an abstract mathematical model. The required metabolic modelling approach is shaped to accommodate the assumptions and provide the sought-after applications [3,[7][8][9]. The metabolic network models have been developed to understand the dominant qualitative features of cell metabolic behaviour in a bioprocess [10], or in a metabolic disease [11]. A comprehensive metabolic model can also improve conventional wisdom regarding the cell metabolism by making important corrections, such as gap-filling reactions suggested by genome-wide metabolic models [12,13]. When required to interfere with cell functioning, a metabolic model can contribute to discovering new strategies [11,14,15], and to organize the disparate accumulated information into a coherent body of practical knowledge [3,16]. In a complex system such as the metabolic network of the cell, a model helps to think (and calculate) logically about what components and interactions are important and what other components can be neglected [17]. However, the extent of progress in the model application to accomplish each of these goals depends on several factors, including the industrial importance and the scientific significance of sought-after results. Based on this premise, metabolic modelling has gained substantial importance to enhance bioprocess optimization [18,19], where the increased efficiency leads to cost reduction in million-dollar order of magnitude, and to develop metabolic therapies [11,14], i.e., to push forward cancer treatment. In a broad classification, metabolic models are divided into four categories based on the capability of the model to distinguish between sub-populations of biomass (unsegregated/segregated) and the extent of recognizing the intracellular biochemical components (unstructured/structured) [17,20], (Figure 1).

Figure 1.
A broad classification of the mathematical metabolic models. The difference in the models that are shown originates from an average cell approximation and a balanced growth approximation in regard with (1) cell population, (2) a single cell. Reproduced with permission from [17].
In this review, we have clustered modelling approaches which are mostly understood as unsegregated and structured. Thus, such a mathematical model describes multi-compartment average cell and when it is desired the model extends the description to the whole population by taking into account the quantity (size) of the population, i.e., biomass concentration. Present approaches proposed in literature were compared based on their ability to elucidate the structure of a network, to then characterize allowable and empirical flux distribution, and to finally take into account flux dynamics regulation. This review is not by any means a comprehensive tabulation of the exhaustively diverse range of the published metabolic models, but a methodology motivated review of the keystones of metabolic modelling. Thus, significant trade-offs between any two categories were acknowledged. Also, from another crucial perspective, the capability of the models under steady state and dynamic state of the metabolic network is assessed. The revised modelling approaches have been used for severely different scales of metabolic systems and typically to serve distinct strategies, therefore, it deems essential to notice the context where models are primarily used to avoid confusion. When possible, a simple running example is used to illustrate the approach being discussed. The running example is first introduced in Figure 2.

Representation Based on the Graph-Theory
In this approach, cell metabolic networks are denominated Bio-Chemical Reactions Network (bio-CRN), including the reactions that govern the cell metabolism. Owing to genome sequencing technologies, annotated genomes for several organisms made possible to reconstruct metabolic network accounting for the major active constituents of the cell, i.e., proteins, DNA, RNA and other metabolites [21,22]. Modelling approaches based on the graph theory have been used to unravel network complexity by quantitative methods. In (the major variants of) this approach, the metabolic network substrates are modelled as nodes of the graph and reactions as the (directed) edges [21,23]. It has been shown, in comparative studies for different species that the graph of a metabolic network can be modelled by a scale-free connectivity structure [24]. In a scale-free structure, the probability distribution of finding a node with k connections follows a power-law function, i.e., P(k) ≈ k −λ , with λ being a species-specific positive constant. In a metabolic network context, it means that the probability of finding a substrate involved in k reactions decreases with a power-law relationship to the increasing number of reactions [23,25,26]. (refer to Figure 1 of [21] for illustration of the connectivity structures.) Reconstruction of the metabolic network for 80 fully sequenced organisms taken from eukaryotes, archaea and bacteria conclude that most of the metabolites participate in a few reactions and a few ones drive many reactions, with these latter defined as hub-metabolites [21,24,26,27]. Interestingly, this statistically heterogeneous distribution of metabolites connectivity resembles an error-tolerant system structure, which is found in human-made networks such as the Internet [28]. It was shown that random elimination of reactions due to mutations of enzyme-coding genes are tolerated by metabolic networks and the structural redundancy allows the modified metabolic networks to result in viable reorganizations of the metabolic routes, i.e., the in-silico model maintained the potential to support growth [29]. In fact, it is argued that "the small world architecture" of metabolic networks, which refers to the scale-free structure, "may serve to minimize transition times between metabolic states. . . " [26] as well. On the other hand, network behaviour, i.e., fluxes rates and distribution, is sensitive to changes of so-called hub-metabolites [24,30,31], which motivated further structural investigation of these special nodes. A sub-network mostly made of hub-metabolites is named the giant strong component of a metabolic network, and it represents the core metabolites typically consisting of less than one third of the total metabolites included in the metabolic network [24]. In [26], the authors identify a node in the substrate graph as a hub-metabolite if its degree exceeds the mean metabolite degree of the network by three times the standard deviation. Moreover, considering the most linked metabolite is ranked first in a network, and it is a shared metabolite between tens of organisms with an average rank of R, it has been shown that metabolites in higher average ranks see less deviation in their ranks among the different organisms. Conversely, the metabolites of lower average ranks, i.e., the ones participating in a few reactions, show species-specific changes in the number of reactions they are participating in.
Topological analysis of metabolic networks structure also provides the average reaction path between any two substrates. As expected, this number increases with the addition of more metabolites to the network. However, some individual metabolites then become more and more connected, thus the average length of reaction paths remains in a much narrower range compared to the range of added metabolites. As an example, if the number of metabolites integrated in the metabolic network increases from 200 to 1000, the average reaction path increases from 4 to 12 (refer to Figure 5 in [23]). Worth mentioning, the calculated average reaction path increases when the connections through the energetic metabolites, such as ATP/ADP, N ADH/N AD + and N ADPH/N ADP + , are eliminated, highlighting their significant role as co-metabolites of many reactions. Excluding these metabolites, the ten most connected metabolites are the intermediates in the glycolysis pathway (Glycerate-3-phosphate, D-Fructose 6-phosphate, D-glyceraldehyde-3-phosphate), the intermediates in the pentose-phosphate pathway (D-ribose-5-phosphate, D-xylulose-5-phosphate), pyruvate and acetyl-CoA, the precursor for purine and histidine synthesis (5-phospho-d-ribose 1-diphosphate), L-glutamate and L-aspartate [23].

Representation Based on the Petri net Theory
Petri net theory supports another topological oriented modelling approach of metabolic networks [32][33][34]. A Petri net is a directed graph, whose nodes have two different types, namely, places and transitions. Thus, substrates (and products) are considered as the places, and reactions are the transitions. In this approach, one end of an edge is connected to a metabolite and the other end is connected to a reaction. Therefore, as opposed to the graphs reviewed in Section 2.1, in Petri net graphs edges are not representative of reactions. Instead, edges have weights resembling the stoichiometric coefficients [35]. Petri net models provide a reliable analysis of the consumption/production relations but not of regulatory interactions. However, several extensions of the primary formalism have been introduced to increase the quantitative suitability of Petri net theory for biological systems analysis. For example, coloured Petri nets, stochastic Petri nets [36,37], self-modified Petri nets [38,39], and hybrid functional Petri nets [33] . Nevertheless, the Petri net representation of metabolic networks still deserves more methodological developments.

The Stoichiometric Matrix of a Metabolic Network
As for a chemical reaction, the stoichiometry of a biochemical reaction tells the relative number of moles on either side of a reaction involving an enzyme in a balanced reaction [3]. In a bio-CRN, the matrix of stacked stoichiometric vectors of all the network reactions forms the stoichiometric matrix of the reaction network. The stoichiometric models of metabolic networks for prokaryotic and eukaryotic cells are available in literature [40]. The stoichiometric matrix for a metabolic network with m metabolites and n reactions is as follows: S m×n = s i,j m×n {i = 1, ..., m|j = 1, ..., n} where the i th row consists of reaction coefficients for metabolite i, with regard to (w.r.t) all reactions of the network, s ij is the coefficient of metabolite i in reaction j. A negative s ij implies that the metabolite i is on the left side of a reaction j and thus assumed to be consumed, if positive it is receiving flux of matter in the reaction j (v j ), and zero if it does not participate in the reaction j: Therefore, we have one column vector for each reaction and Equation (2) can be written in matrix-vector notation for the whole metabolic network as follows: The stoichiometric matrix can be reshaped into four partitions, each of which with specific characteristics (Figure 2).
Structural analysis of metabolic networks can be more elegantly performed by using the matrix formalism instead of topological theories, as it is transferable to a graph representation while such transfer from the topological representations is not always possible [3,6]. Either way, the analysis is not restricted only to genome-scale metabolic networks, and it can be efficiently used in an early step of modelling to draw a non-intuitive sense of network reactions connectivity. However, relying solely on structural information to extract connectivity interactions suffers from the flawed assumption that every biochemical reaction is worthing same weight, regardless of its level of activity. Therefore, the structural analysis results can hardly be directly useful for investigating the cellular mechanisms for flux regulation [41,42]. Contribution of the stoichiometric matrix to the formulation of the further approaches is fundamental. The algebraic properties of stoichiometric matrix and nowadays matrix computation capacity has fuelled further use of the stoichiometric matrix as an important part of any metabolic mathematical models. The significance of its role in the representation of metabolic networks will become evident through the following sections.

Metabolism is a Constrained System
Approaches based on the metabolic steady state hypothesis have been first proposed for estimating theoretically active enzymatic reactions [43][44][45][46], and to estimate metabolic flux rates from the reaction network stoichiometry and experimental time-data of extracellular metabolite concentrations [11,[47][48][49][50][51][52]. The mass conservation constraint at the steady state is formulated through the stoichiometric matrix for a network of biochemical reactions at any size, and this constraint has a central role in the determination of the solution space in many approaches (following sections) [53]. The observation that the changes of intracellular fluxes occur at a faster dynamic compared to the changes of the extracellular concentrations supports the validity of the steady state assumption [54,55]. A detailed mathematical justification of the (pseudo) steady state assumption in the mathematical modelling of biochemical reaction networks is proposed in 1983 in [54], however, in the context of this review article, this assumption is valid at constant exponential growth phase in batch cultures, and in continuous cultures operating at steady state [56]. The advantage of the constraint-based modelling is that these approaches do not need to deal with enzyme kinetics and post-translational regulatory mechanisms [57]. In the following, we review the most relevant approaches and the associated constraints that are imposed on metabolic networks.

The Null Space of Stoichiometric Matrix
From a mathematical point of view, the stoichiometric matrix S, denotes a linear mapping of the reaction rate vector into the space of accumulated concentration time derivatives of metabolites [58]. To illustrate, consider N to be a linear mapping function, let: since N is linear, we can write: For the particular case of metabolic network operating at steady state, we have S (m×n) v * (n×1) = ẋ (m×1) = 0. The set of vectors v * (n×1) are linearly dependent to the members of the basis for the null space (Equation (8)) and thus, a subset of the null space of the stoichiometric matrix S (Equation (9)). Where r is the rank of stoichiometric matrix S, the basis for the null space is a set with q = (n − r) linearly independent column vectors of dimension n, which generates the null space for stoichiometric matrix S when spanned [58].
Null space(S) = span{ b 1 , . . . , b q }, q = (n − r) where K (S) denotes the basis for the null space of stoichiometric matrix S.

Metabolic Pathway Analysis (MPA)
Metabolic networks are an interconnected set of the metabolic pathways which interconnect at branch points and through shared metabolites. It is common, inside a metabolic network, to have several redundant routes for the transformation of one given metabolite to another. It is particularly true for linking external substrates to external metabolic products [59,60]. These multiple routes, which obey to mass conservation principle at steady state, are considered a main cause of biological robustness [61,62]. Thus, model development by the decomposition and characterization of a metabolic network into definitive pathways has attracted attention to allow identifying the dominant routes [9,63,64]. Particularly, MPA modelling approaches adapt a convex basis analysis methodology to estimate a theoretically feasible relative flux distribution. In MPA, component pathways of the network and genetically independent reaction pathways are identified [9,65,66], such insight is particularly of interest in gene-knockout and synthetic pathway construction studies [67], and in designing stoichiometrically growth-coupled overproduction in production host organisms [68].
MPA is usually used as an umbrella term for the approaches based on the notion of elementary flux mode analysis (EFMA). A flux mode is a flux distribution vector that satisfies the mass conservation constraint for the intermediate metabolites at steady state, and does not violate direction restriction of irreversible reactions. The term flux modes was coined by Schuster et al. [6,63] in 1994, to address the independent enzyme sets responsible for a component pathway. If it is not possible to further decompose a flux mode to a linear combination of simpler flux modes, it is considered as an elementary flux mode (EFM). In biochemical terms, it is a minimal set of enzymes "that can operate at steady state, with all irreversible reactions used in the appropriate direction" [65]. In mathematical definition, the convex basis vectors for the null space of the stoichiometric matrix of a metabolic network represent EFMs [69] (see Figure 3 and Table S1, Supplementary Materials, for properties of the different MPA methods). The algebraic tools and methodologies of MPA have been continuously enhanced to address shortcomings of the EFMA and therefore enabling EFM-based methods to claim more applicability in systems metabolic engineering. EFMs reproduce all the admissible reaction pathways at steady state in an unbiased manner, therefore, enumerated EFMs of a large metabolic model can quickly exceed computational capacity of a typical computing system of the date. However, various studies show that usually a small percentage of these calculated EFMs are biologically and thermodynamically feasible [70,71], or relevant to the metabolic scenarios under study [72]. Thus, the integration of constraints from biology and thermodynamics, in addition to the introduction of more efficient computational algorithms are utilized to ameliorate the computational cost of enumerating EFMs for genome-scale models.
Extreme pathway analysis (EPA) is one closely related method to EFMA, in its formulation, an important condition is the net direction of each reaction. Therefore, the reactions that can be reversible may flow in two different directions at different states of cell metabolism. Knowing that, EPA is assumed to extract the minimal subset of EFMs [73]. Considering the n-dimensional vector of intracellular fluxes, EFMs form a convex cone in the solution space, while extreme pathways characterize the edges of this polyhedral cone [53,71,74]. Even though EPA is computationally less demanding than EFMA [75,76], it may exclude possibly important reaction routes and thus causes loss of valuable structural information [77]. Another important mathematical framework used for MPA is Minimal Cut Set (MCS) approach, which was introduced in 2003 [78]. A MCS with respect to an objective reaction represents a set of other reactions of metabolic network, whose inactivation will make it impossible for the objective reaction to appear in any steady state flux distribution of the network. However, since implementation of MCS can simultaneously eliminate a desired function of metabolism, constrained minimal cut sets (cMCS) was introduced to circumvent this limitation by allowing for the definition of a set of desired flux modes to be preserved during the enumeration process [79]. Recently, Axel von Kamp and Steffen Klamt enhanced the computational algorithms of MCS enabling this method to operate on genome-scale level, in a comprehensive computational investigation of growth-coupled production, the authors show capability of the approach to investigate feasibility of coupling different metabolites to growth in five major bioproduction host organisms [68]. Use of these approaches complemented promising methods to identify coregulated reactions and co-expressed genes [80], and composition of the minimal required substrates in order to produce valuable metabolic products [80,81] (see Table 2.1 in [82] for a list of software packages for enumerating EFMs). Authors in [83] generated hypotheses for strain metabolic engineering guided by the MCS computation of the metabolic network of an industrial microbe (Pseudomonas putida). The experimental implementation of the guided predictions enhances titer, rate and yield of the microbe by coupling its growth to bioproduction, i.e., production starts from late exponential phase.  Figure S1, Supplementary Material, for the network of reversible reactions).

Metabolic Flux Analysis (MFA): Admissible Metabolic Fluxes
Flux quantification in a metabolic system reveals significant information about the cell physiological state. Such information, particularly characterizes the cell response to genetic manipulation or changes in its environment, thus, may be used to improve a bioprocess toward predefined optimization objectives [17,84]. Moreover, it was shown in several studies that the rates of exchange reactions, i.e., intermembrane transport reactions, can be used not only to quantify the rates of the intracellular reactions, but also to describe global processes taking place in the corresponding metabolic state of the cell [17,[84][85][86]. In a comprehensive metabolic network, the number of metabolites is typically smaller than the number of intracellular and exchange reactions combined. Thus, the associated stoichiometric matrix renders an under-determined system of equations, where the metabolites form balance equations, and the reaction rates are unknowns. When the system of equations is solved, MFA provides an empirical flux map [87][88][89], instead of the relative distribution of the fluxes provided by MPA. The mathematical formulation is as follows [55]: where the vector v u represents unknown fluxes, the vector v k the measured (known) fluxes, and S u , S k the stoichiometric matrix subsets for unknown and measured fluxes, respectively.
To have a determined system, the degree of freedom of a system must equal zero. Generally, some reaction pathways are excluded from a comprehensive metabolic network to qualify it for MFA [90], and experimental data regarding concentration of medium components are used to further decrease the degree of freedom, i.e., identifying more exchange reactions. Measurements provide the estimation of cell concentration and uptake/secretion rates, which are the characteristic parameters of cell activity. Conversely, the degree of freedom increases as a result of the branches of pathways diverging and then rejoining later in the network, reversible reactions and metabolic cycles. These structures add reaction columns to the stoichiometric matrix, which are linearly dependent to the existing ones [91,92]. For instance, the authors in [93] show that in a mammalian cell, it is essential to determine the uptake/secretion rates of ammonium and the secretion rate of either CO 2 or O 2 . Because these are the cometabolites of the TCA cycle reactions, thus, determination of them enhances the observability of the fluxes in the TCA cycle. However, generally the balance equations alone are not capable of addressing such issues, therefore, additional constraining balance equations must be introduced to determine the system and consequently solve for the unknown fluxes [91,92].

13C-Metabolic Flux Analysis
More sophisticated substrate labelling experiments can be performed to overcome limitations of MFA approach such as from reversible and cyclic reactions [85,86,[94][95][96][97]. Metabolite and isotopomer balancing have been developed since 1995 [98]. In this method, a labelled substrate (e.g., 13C-glucose, 13C-glutamate, 14N-glutamine) is administered to the cell, leading to the labelled atoms in different metabolites and in the macromolecular biomass constituents and thus as well as in the secreted products [96,99]. Henry et al. [85,86], used 13C-MFA to optimize the timing of induction of a biphasic Chinese hamster ovary (CHO) cell culture, by assessing the intracellular flux maps before and after production induction. Parallel labelling experiments as an alternative to traditional single tracer experiments [100] was used by Antoniewicz et al. to characterize flux distribution in growth and non-growth phases of CHO cell culture [52], among numerous other applications of this method [84].
Chromatography methods are used to separate labelled and non-labelled metabolites from each other (i.e., metabolite identification), then, mass spectrometry is used to measure their abundance by mass determination (i.e., metabolite quantification). Thus, use of a coupled either liquid or gas-chromatography mass spectrometry apparatus (LC-MS or GC-MS) is a prevalent strategy for metabolite profiling [94,101]. MS is particularly more popular in comparison with nuclear magnetic resonance (NMR) and liquid-chromatography massspectrometry (LC-MS), because of its good performance, extensive databases, relative cost of operation and the ease of maintenance [102,103], but exact quantification is more straight-forward with LC-MS because of the use of known calibration curve with standards. The acquired data has to undergo correction and normalization to provide the measurements in a form that has biological information content. Kanani et al. [104], discussed the main sources of biases which arise in GC-MS and troubleshooted some of these biases, following the efforts to standardize this methodology as an integral part of the metabolomics data generation [105,106].
The non-experimental stage of metabolite labelling flux analysis consists of solving the isotopomer balance equations by computational algorithms. However, the capability of these algorithms to provide interpretation of experimental data and to support strategic experimental design is limited. In fact, increasing the power of frameworks to model isotope labelling system, with the least (or no) loss of information, is an active topic of research [99,102,107,108]. Recently modified versions of 13C-MFA were used to overcome the challenges of measuring metabolic fluxes in the distinct compartments of the eukaryotic cells (e.g., cytosol, mitochondrion). In a recent article, the authors provide an updated detailed protocol to perform 13C-MFA for the quantification of intracellular fluxes [102], and in a comprehensive review the latest developments in the MFA has been described [90].

Constraints Augmented with a Hypothesized Objective Function for the Cell
The information available about carbon flow distribution in a metabolic system mostly suggests that the cell primary task is biomass synthesis, then homeostasis and maintenance of cell activities, and to a lesser extent, the production of an important product such as monoclonal antibodies. The priority and significance of these tasks change depending on the organism origin, time and mode of cell growth. Thus, the assumption that the cell metabolism has a particular predefined objective, is used to determine metabolic fluxes distribution [40,96,109]. The commonly assumed objective function for flux balance models is the maximization of biomass growth (See Table S2, Supplementary Materials, for a list of other objectives). Flux Balance Analysis (FBA) is the framework used in this approach since it allows determining ranges for the allowable flux distributions based on the hypothesized relevant biological objectives of the cell [87]. The mathematical formulation of a cell objective, accounting for defined fluxes restrictions with bounding limits for the reaction fluxes, is developed as a linear programming optimization problem (Equation (11)). max C T . v subject to: Considering that the stoichiometrically defined domain of flux vectors is dictated by the cell metabolic genotype, the solutions of an optimized objective function will be pertained within the metabolic phenotype of a specific cell-line or strain under study [53]. The arrays of the row vector C T (n) are the linear coefficients representing weights of the fluxes in the objective. The solution for one particular objective function, such as the maximal growth, may be different and even contrasting from another objective, such as for maximal production of a secondary metabolite [40,96,110]. It is exemplified in Figure 4 that how FBA is used to estimate the metabolic network flux maps for the running example.
The particular solution for the relative distribution of the fluxes can be used to interpret, describe and predict experimental results. The workflow to perform FBA is as follows: 1.
To curate metabolic reactions from the annotated genome data.

2.
To identify the topology and the structure of a network.

3.
To identify the flux vector solution space at steady state.

4.
To impose constraints and bounds on fluxes.

5.
To define hypothesized objective function of interest for a biosystem. 6.
To realize the vector space of the solution and identify the optimum solution (i.e., the flux map). The accuracy of FBA solutions is partially determined by the accuracy of constraints imposed on the fluxes, and also, by the relevancy of the biological objective function that is assumed. Thus, it is not rare that the FBA solution provides unfeasible predictions of the flux distribution. To reconcile the associated issues, the annotated genome information and the results of experimental research must be combined [111,112]. Different advancements such as the inclusion of regulatory and thermodynamics constraints in the FBA have been introduced to address the challenges associated with the application of this approach (see Table S3, Supplementary Material, for the list of FBA enhancements and Table S4 for the comparison of FBA and MFA).
The constraint-based modelling approaches are particularly useful when cellular activities involved in a particular cell physiology of interest are not well understood, hence hindering the application of mechanistic modelling approaches. The lack of knowledge of the mechanistic basis of cell activities results in the increasing uncertainty of corresponding model parameters. In such a situation, a hypothesized objective function based on the established knowledge on the global cell behaviour (i.e., growth rate) justifies the rationale behind modelling the metabolic network using constraint-based approaches. However, one of the adverse consequences of this is that the constraint-based models are generally poor at extrapolating outside the imposed experimental conditions. These models are also weak in accommodating sequential changes in the conditions, i.e., in the fed-batch regime, which are frequently occurring in bioprocesses, as well as describing medical cases related to cancer cell metabolism [15].

Thermodynamics-Based Constraints
Thermodynamics laws dictate that the direction of every spontaneous reaction is from lower entropy to higher. Thus, a living organism encompassing the vital nonspontaneous reactions must have a continual energy input and then dissipate some of this energy in the form of unusable heat, in order to continue existing. Boltzman had already understood this principle when he hypothesized that: The general struggle for existence of animate beings is therefore not a struggle for raw materials, nor for energy, but a struggle for entropy which becomes available through the transition of heat from the hot sun to the cold earth (Boltzman, 1886).
Hence, the biological systems cannot attain true equilibrium state, and their thermodynamics is studied under the non-equilibrium steady state (NESS). Classical thermodynamics can describe the direction of change for systems nearing equilibrium, however, when the distance from equilibrium state exceeds a critical value, the system may exhibit non-equilibrium structured states that are maintained only with a continual input of energy [113,114]. This is one of the reasons that makes circulation of energy in the cell possible and necessary. Thus, the direction of reactions in the metabolic pathways is determined as the result of the consistent manifestation of reactions taking place far from equilibrium. The true equilibrium is never reached unless the cell is practically dead, i.e., insignificant regulated metabolic activity is observed [115][116][117][118]. Imposing the constraints that emerge from this underlying theory narrows down the solution space of the metabolic model and thus ensures complying to the thermodynamics laws [119].

Estimation of the Gibbs Free Energy Change
A positive net change of the Gibbs free energy of a reaction (∆ r G) suggests that the reaction cannot occur spontaneously, i.e., it is endergonic, unfavourable or nonspontaneous. Conversely, an exergonic reaction is one that releases work energy and can be assumed spontaneous or favourable. By the means of Gibbs free energy, the degree of thermodynamic favorability of the reactions in a metabolic network can be quantified [120]. The spontaneity described by the Gibbs free energy change is concerned with whether or not the reaction needs continual input of energy to take place. However, the change in Gibbs free energy between substrates and products reveals no information regarding the rate of the reaction. This extensive variable has the following expression (Equation (12)) [121]: where K denotes the equilibrium constant of the reaction, Q is the reaction quotient. The standard state which is denoted by ( • ) is defined as T=298 (K), pH = 7 and concentrations of molecular compounds, except for H + , OH − and H 2 O, equal to 1 mol/L (M). Theoretically, the experimental values of a reaction equilibrium constant can be measured by making a solution of the enzyme and the molecular compounds participating in the reaction, to then allow them to reach equilibrium state; where the substrate and product concentrations are fixed and then measured to calculate the equilibrium constant [122]. A large number of chemical reactions have been investigated in this manner, and the acquired data have been collected in thermodynamics databases (see Table S5 of the Supplementary Materials). In addition, tables of thermodynamic data are available in literature such as in the extensive works of Alberty [123,124]; Thauer [125,126], and Dolfing [127,128]. However, for the majority of the biochemical reactions experimental derivation of ∆ r G • is troublesome [121], mainly, because of the lack of reliable models for metabolic chemistry, the difficulty of conducting in vivo measurements, and the loss of accuracy of in vitro values when transformed to the in vivo conditions [129]. Thus, experimental data of the equilibrium constants of metabolic reactions are scarce and mostly unreliable. To address this shortcoming, the Gibbs free energy change of the metabolic reactions is estimated as the sum of the Gibbs free energies of formation of the participating reactants and products (Equation (13)).
where s i,j is the stoichiometric coefficient of compound i in the reaction j, and ∆ f G • i denotes the standard Gibbs energy change of formation for the compound i. As a result, ∆ r G • j is calculated for the reaction j. Consequently, the problem then becomes how to estimate the Gibbs energy change of formation for the participating molecular compounds. The Gibbs energy change of formation for a compound can be calculated from its standard Gibbs energy of formation and its thermodynamic activity [130].

Group Contribution Methods (GCMs)
GCMs are statistical estimation methods for the estimation of Gibbs free energy of formation [131]. In the GCMs, it is assumed that the standard Gibbs formation energy of a metabolite is a linear summation of the formation energies of its constituent molecular substructures (or groups, denoted as P i ) [120,121,132]. Moreover, a common reference of estimation for all functional groups that are involved is defined. If a specific sub-structure appears more than once in the main molecular compound, number of the occurrences must be taken into account as a coefficient applied to the contribution of that specific substructure (N i ). The general formulation of the property calculation is as follows (Equation (14)).
Developed from the classic work of Mavrovouniotis [121], Jankowski et al. [120] provided a version of GCM tailored for biochemical networks. By this method one can estimate the standard Gibbs free energy of formation ∆ f G 0 , and consequently the standard Gibbs free energy of reaction ∆ r G 0 , based on Equation (13). In the recent years, the accuracy and the scope of application of GCM have been continually improved, but, however, there are still some issues that are limiting the confidence interval of this method estimation. As categorized by Du et al. (Figure 1 in [122]), the issues are associated with estimation accuracy, data quality and convergence, and inherent difficulties of the GCM methodology. Some promising directions to address the current shortcomings consist in gathering more curated thermodynamic data for fitting of Gibbs formation energies, and to extend current methods so they can calculate equilibrium constants as a function of temperature [133]. In addition, uncertainty at different degrees arises from several factors that affect GCM calculation, such as ionic strength, ion concentration, pH and temperature [133] and certain group interactions cause large errors in the total value of the property estimated [121,134].

Modelling Approaches Complying to the Thermodynamics-Based Constraints
In order to rule out closed reaction cycles from FBA, Energy Balance Analysis (EBA) enforces nonlinear thermodynamics-based constraints on chemical potential [114,135]. When metabolomics data is available, Network-embedded Thermodynamic analysis (NET analysis) developed by Kummel et al. [136] can be used as a computational thermodynamics-based framework for the estimation of the feasible range of the Gibbs free energy change of reactions of the network under physiological conditions. In addition to the standard Gibbs free energy change of formation of the metabolites, NET analysis requires metabolomics data, predefined directions for the reactions and a stoichiometric metabolic model. However, the lack of the direction assignment does not lead to wrong results, but less insight from the metabolomics data. Interestingly, NET analysis does not require a predefined metabolic objective of the cell and thus the identification of the underlying thermodynamic infeasibilities is not biased. The types of results that can be obtained from using NET analysis comprise ascertaining the thermodynamic consistency of measured metabolite data, prediction of concentrations of unmeasured metabolites and identification of the potential enzymes sites for regulatory actions [110,136].
In the method proposed by Hatzimanikatis and co-workers [129], named Thermodynamicsbased Metabolic Flux Analysis (TMFA), the CBMs is augmented with the Gibbs free energy change of the reactions to form a mixed integer linear programming (MILP) problem. The solution of this MILP optimization problem eliminates any thermodynamically infeasible flux distributions and, moreover, estimates feasible metabolite activity ranges. The level of insight provided by TMFA is largely determined by the number of reactions for which the Gibbs free energy change is known. In a genome-scale model of E. coli (iJR904) [137], the experimental values of Gibbs free energy change of the reactions and formations are only available for 5.6% of the reactions and 11% of the compounds, respectively. However, employing GCMs resulted in estimating the Gibbs free energy change for more than 90% of the reactions and metabolites [120], thus, allowing the implementation of TMFA on the iJR904 metabolic model. The suggested ways to overcome the superficial infeasibility of the essential reactions includes correcting for uncertainty involved in the estimation of Gibbs free energy changes and adjusting the metabolites activity ranges. The impact of such remedies is studied for the reaction dihydroorotase in the E. coli genome-scale metabolic model [129]. (For the equations and inequalities that describe the mathematical formulation of TMFA refer to Text S1 in the Supplementary Materials.) Generally, the implementation of CBMs with the thermodynamic constraints divides the reactions of a network into three categories. First, bottlenecks on flux direction, which are the reactions with ∆ r G close to zero. The second is the reactions that have exclusively highly negative values of ∆ r G , meaning they are independent of the concentration of metabolites. The reactions of this category receive a special attention as regulatory points of the cell metabolism. The third category is the reactions in which ∆ r G is highly dependent on the metabolites concentrations. These reactions can be used to determine the feasible range of concentration or concentration ratios. In many biochemical reactions, we observe interconversion of some known pairs of molecular compounds such as ATP/ADP, N AD + /N ADH and N ADP + /N ADPH. These metabolites are assumed to be tightly regulated and have a relatively complicated structure, but minimal structural difference between one another, therefore, a more accurate contribution share for their interconversion is estimated by GCMs [121]. Henry et al. [129] found the feasible ranges for N ADP + /N ADPH and N AD + /N ADH in a genome-scale metabolic model of E. coli and showed that the ratio ranges comply to the assumption that these energetic pairs are tightly regulated. In fact, the cell maintains the ratio of the former energetic pair close to the maximum allowable value, while the ratio of the latter is kept close to its minimum value. It is in contrary to the majority of metabolites where concentrations are maintained close to the logarithmic means of minimum and maximum allowable activities, suggesting the wider range of fluctuations allowed for their activities. The advantages of using TMFA can be described into three aspects [132]: • Assignment of thermodynamically feasible directions to all reactions in a network. • Elimination of futile cycles (infeasible closed cycles). • Consideration of thermodynamically coupled reactions.
In addition, by coupling thermodynamics of a metabolic network with the EFMA, one can reduce the number of relevant EFMs [119,138]. In fact, it was demonstrated that to have a thermodynamically feasible intracellular flux distribution, all the founding EFMs must be thermodynamically feasible too. Based on this, the authors of [110] discarded 46% of a total of 71.3 million EFMs generated for a compartmentalized model of central metabolism in S. cerevisiae. Peres et al. [138] compared the traditional MPA approach, with an approach where they incorporate the external metabolite concentrations in addition to the standard Gibbs free energy change of reactions. Consequently, the incorporation of external metabolites allowed for the enumeration of different EFMs in the different culture conditions and cell growth phases. However, to overcome the need of discarding the already generated EFMs and thus to decrease computational cost, tEFMA was developed that generates only thermodynamically feasible EFMs in the first place [139]. Figure 5 gives a quick overview of the context of applications of thermodynamics-based theories which are briefly explained in Table 1. Table 1. Thermodynamically constrained models along with their specifications and reference to their origin.

Theory Name Specification Reference
Energy balance analysis (EBA) Identifies reactions' direction and flux limits based on the Gibbs free energy value of reactions. [135] Network-embedded thermodynamic analysis (NET) Identifies thermodynamically feasible flux range or metabolite concentration range, based on the predefined standard Gibbs free energy and predetermined flux directions. [136] Thermodynamics-based metabolic flux analysis (TMFA) Identifies flux direction, allowable flux range and also concentration ranges. Incorporates a MILP optimization. Based on the predefined standard Gibbs free energy. [129]

Extensions of Constraint-Based Flux Balance Models Simulate Growth Dynamics
Even though enzyme activities in a cell are constantly changing as a result of metabolic regulation, the validity of the steady state assumption is justified in many cases, such as in bioprocesses in continuous cell cultures and in early exponential growth phase of batch fermentation. However, there are many practical situations like in batch and fed-batch industrial bioreactors , where cells exhibit a dynamic behaviour [140]. In recent efforts, further developments have been done on the constraint-based approaches to form dynamic flux balance analysis (dFBA) and dynamic metabolic flux analysis (dMFA) [11,47,49,50,89]. Studying a biosystem under transient conditions using dFBA and dMFA approaches uses steady state models sequentially, evaluating the state of the biosystem at each sampling time interval [141]. In this convenient way, the model can be used to estimate the dynamic changes of intracellular fluxes and to decipher the major activated metabolic pathways in the reaction network without the need for a large set of predefined kinetic parameters [142]. The particular advantage of this enhancement is its capability to describe the growth dynamics of biomass [143]. This approach uses the constraint-based models at each assumed steady state to estimate the (momentarily) growth constant, to then find a new set of the exchange rates through integrating the macroscopic mass balance equations of substrate consumption and product formation. The calculated exchange rates are then used to constrain the next iteration of the constraint-based model simulation updating the growth constant, and the iterations go forward. To have a smooth continuous simulation of the bioprocess dynamics, the time between two consecutive operation points, i.e., two consecutive cell count measurements, is modelled with a continuously differentiable function, such as Monod model. The application of this line of metabolic model development has been shown in describing dynamic growth on multiple carbon sources for a small metabolic network [144,145]. In essence such model obeys a discrete-continuous framework, where the discrete states are decided based on logical transcriptional regulatory rules and then continuous states, i.e., concentrations and fluxes, are computed according to the specific parameters of each discrete state.

Kinetic Modelling
Dynamic kinetic models attempt to provide a mechanistic description of the biosystem in terms of the cell enzymatic activity and mass balances over intracellular metabolites. Thus, the mass conservation equations for the components form a system of ordinary differential equations (ODEs). The values of the intracellular fluxes and concentrations are derived as a numerical solution of the conservation equations. Kinetic models bring about the possibility to interpolate and extrapolate dynamic behaviour of a system, in a consistent fashion, in situations other than the one on which the model was originally validated [57,146,147]. Moreover, this type of mathematical models can simulate changes in the relative activation of enzymatic reactions as a function of parameters of the system and the initial concentrations [148][149][150][151]. The general mathematical formulation is given in Equation (15) as follows: where the vector x(t) is a vector of dimension m of time-dependent state variables such as extracellular and intracellular metabolites concentration, the vector v is here a nonlinear time-dependent vector of the reaction fluxes, which depends on x(t), a vector of regulatory inputs u(t), and θ which stands for the collective set of kinetic parameters and initial states [61,152,153]. In some cases, algebraic equations or additional ODEs are added to this general representation to reflect, among others, conserved moieties total concentrations, volume changes or supplementary descriptive variables [152]. The flux vector is equivalent of the vector of biochemical reaction rates and is reported in the units of mole of reacting matter per mass of biomass per time. The inclusion of reaction kinetics in Equation (15) introduces nonlinearity to the formerly linear mapping of a reaction rate vector into a space of accumulated concentration time derivatives of metabolites (ẋ(t)) [58], and in return, provides the capability to calculate concentrations change and intracellular fluxes at transient, i.e., not only at steady state. The function f j is to provide a mechanistically valid description of reaction rates of the reactions taking place in an underlying metabolic network. The mathematical formulation of the reaction rate is driven by kinetics theories (i.e., transition state theory), thermodynamics theories (i.e., theory of equilibrium, linear and nonlinear non-equilibrium theory) and first principles (i.e., mass balances of metabolites). The impact from all the reaction rates (metabolic fluxes) where the metabolite i participated is calculated based on the corresponding row of the stoichiometric matrix S, (first term on the right-hand side of the Equation (16)), integrating over this term for a period ending at t provides the concentration of the metabolite i at t. However, the numerical solution of the underlying ODEs integration is quite troublesome due to the large number of parameters, the nonlinear nature of enzymatic reaction rates and the stiffness caused by the difference between the time scale of the underlying bioprocesses. Addressing the issues that are obstructing the use of kinetic modelling on large-scale metabolic models, i.e., genome-scale models, has been focus of the research community. In the following, the most relevant mathematical formulations are presented and their merits and drawbacks are discussed.

Approximate Kinetic Formats and the Quantification of Metabolic Regulation
Alternative approaches to mechanistic kinetic modelling that uses non-mechanistic kinetic models have primarily emerged based on a reasoning that suggests, because of the homeostasis constraint, redesign of metabolic networks does not require detailed mechanistic models [8,154]. Thus, the development of approximate kinetic formats provided mathematical equations to approximate reactions kinetics in the neighbourhood of a reference metabolic state, with a decreased number of kinetic parameters. The approximate kinetics that are formed generally have four characteristics in common: proportionality between reaction rate and enzyme concentration, capability to reach the plateau phase for the reaction rate against substrate concentration, involving the least possible kinetic parameters and providing analytical solution at steady state [155]. Approximate kinetic formats explicitly incorporate metabolite concentrations and provide a framework for the quantification of metabolic regulation [61,155-159].

Metabolic Control Analysis (MCA)
Sensitivity analysis (SA) of metabolic model parameters for determining their influence on the model simulation is crucial [160]. The initial values of metabolite concentrations and of kinetic parameters, as well as the constraints on flux rates may be considered as sensitive factors influencing model behaviour [161]. The SA divides into two major groups, namely, local and global SA. Local SA methods consider that only one input varies at a time and the perturbation results are observed in outputs, while the rest of parameters is maintained unchanged. Conversely, in global SA, all inputs vary simultaneously and the resulting effect on model output can be investigated by intense computational cost. In fact, global SA uses repeated sampling from the probability distribution of each input parameter to obtain numerical results covering the entire range of variation of the metabolic model parameters [162]. Local sensitivity analysis is widely used for characterizing the effect of parameter changes on the solution of dynamic models in the neighbourhood of a certain reference state. For a metabolic model, the linearization about the reference state is generally expressed as a first order Taylor series approximation [162][163][164]: where c i is a property of the metabolic system, i.e., the metabolite concentration, p • is the initial set of the parameters, and p g is the parameter subject to change. Explicit differentiation can be used when the model explicitly states the relation between the desired output and the input, in this case sensitivity function represents an analytical solution which can be used to calculate the sensitivity. When the output is provided as an implicit function of the input parameter(s), implicit differentiation might be used. However, in practice local sensitivity coefficients are calculated with numerical approximation [164]. Simulation is used to determine a variable of interest (i.e., steady state metabolic flux) at two different parameter values (i.e., enzyme total concentration), then various numerical forms of the Equation (17) formula are used to calculate the absolute and relative local sensitivity [2].
One of the well-known SA frameworks in the metabolic modelling context is metabolic control analysis (MCA) that is widely used to study metabolic pathways interactions [165][166][167][168]. By the use of MCA, the effect of a parameter perturbation on metabolic fluxes or metabolite concentrations can be estimated or reported in an experimental scenario [168]. The three main coefficients in MCA are defined based on the notation in [154] in Table 2.  Table   Name Mathematical Formulation Description

Control coefficients
The coefficients are a measure of Intracellular metabolite: the relative change in a flux or concentration upon a relative change of an enzyme activity level The coefficients are a measure of Intracellular metabolite: the effect of a change of an external parameter, on intracellular fluxes and concentrations Elasticity coefficient Intracellular reaction rate: The elasticity is a local measure quantifying the relative change in reaction rate upon a relative change in metabolite concentration, while maintaining other concentrations and parameters constant On theoretical grounds, dynamical stability of a stable metabolic system dictates that upon the perturbation of one state of the system at steady state, the states of the system, i.e., metabolite concentrations and intracellular fluxes, eventually returns to the same steady state. When one parameter is perturbed, structural stability ensures that the system eventually returns to the same steady state or to one close by. If the system is unstable, such perturbations result in the diverge of some or all of the states. Hence, a subset of the system states is chosen that relative to the period of the observations, the concept of stability and steady state for them is associated with the same time period, i.e., the phenomena have close enough relaxation times [168]. Then, the study of the metabolic system is confined in the neighbourhood of such stable steady state of the biosystem. Thus, in practice, MCA has a limited capacity in quantification of the control distribution within the metabolic network as the theory stays only valid for small variations in parameters, where the linearization in Equation (17) is a good approximation of the nonlinear kinetics behaviours. The rigorous mathematical formulation of MCA was exploited to derive several theorems to relate the MCA coefficients in global relationships, i.e., the summation and connectivity theorems, among others (for a mathematically rich elaboration see [167,168]). Attempts to extend the application of MCA lead to the expansion of enzyme concentration changes for which the theory holds valid [154,169]. These approximate kinetic formats are log-lin and lin-log based on the approximation procedure employed to derive each one. However, they are typically formulated directly from the stoichiometry of the metabolic network as given in Table 3. Table 3. Various approximate kinetic formats based on MCA and Biochemical System Theory (BST) using reference parameters and reference elasticities (See the following paragraph for the notation definition).

Type of Approximate Rate Law
Mathematical Formulation

Biochemical System Theory (BST)
Power-law representation of the ODEs (Equation (15)) is the key ingredient of more general Biochemical System Theory (BST) developed primarily by Michael Savageau and Eberhard Voit in the 1960s [170,171] prior to MCA development. In the two global formats of BST, namely, Generalized Mass Action (GMA) and canonical S-system representations the sums and differences of multivariate products of power-law functions are used to represent dynamics of changes in biochemical processes and metabolite pools, respectively [156][157][158][159]. Sharing much of the modelling philosophy with MCA, BST gives the same steady states solution, and the same local stability properties (at the same reference state), which implies that the different parameters of BST can be derived from the coefficients used in MCA, i.e., kinetic orders in BST from elasticities in MCA [172][173][174]. This theory has been extensively studied and discussed in several books and journal articles, one can refer to [172] for a comprehensive review.
Where m denotes all concentrations including the dependent (intracellular concentrations) and independent (extracellular concentrations) variables; k and j count reactions and variables respectively. γ represents rate constants (either measured or estimated); f denotes kinetic orders in BST representation. When reformulated to show S-system representation, the first term on the right-hand side is an aggregated term for all the reactions flowing in the pool and the second term denotes all of the fluxes leaving the metabolite pool [175]. In BST kinetic orders play a major role in the purpose of modelling, unlike mass action kinetics they can acquire negative or non-integer values. For example, an inhibitory variable can be included with a negative kinetic order. The direct biological meaning of parameters in BST representation is an advantage [172,173]. Du et al. [8] listed two general situations where there is a higher chance of rate law approximations success:

•
In the domain where the underlying assumptions of a specific rate law approximation remain valid and not substantially violated throughout simulations; • The rate laws are not the most important single factor in determination of dynamic behaviour of the network.
In other studies by Heijnen [155] and Visser and Heijnen [154], biochemical system theory (BST), generalized mass action (GMA) and lin-log approaches were compared, and authors concluded that the lin-log approach has the best approximation capacity and its solutions are valid for large changes in enzyme activities. However, Wang et al. [173] argue Heijnen's theoretical conclusion claiming that lin-log and log-lin representations might misbehave when the accompanying substrate concentrations approach zero, it is the opposite of the problem with GMA highlighted by Heijnen, where the governing equations' outcome approaches infinity for unbounded concentrations, i.e., inability to simulate saturation. In another review study, Voit proposed that for metabolic design and theoretical study of the principles of operation, S-system format can provide a better default point to start because it allows the user to perform algebraic calculations at steady states and more straightforward stability and sensitivity analysis [175].

Regulated Kinetic Metabolic Models
Flux regulation enabling continuous cellular activities resemble applying certain control strategies to maintain, increase or decrease the rate of production or consumption of biomolecular compounds (i.e., metabolites, enzymes or signalling proteins). Feedback repression and induction are two examples of these regulation scenarios in epigenetic level, which is the study of interactions in genetic level affecting gene expression, i.e., transcriptional regulation. Their counter-scenarios at metabolic levels are feedback inhibition and enzyme activation [176]. Competitive inhibition and allosteric regulation are two prevalent ways among others, which have been formulated mathematically to model enzyme activity regulation, i.e., post-translational modifications [177]. The regulated kinetic models of metabolic network reactions with the condition-specific parameters of affinity constants and maximum reaction velocities can simulate the nonlinear metabolic behaviour of biosystems in an extended time frame [178], i.e., cellular growth dynamics. The rationale justifying this practice is in two folds, first, generality of the governing assumptions for first-principle modelling covers biological systems such as the viable cell. Second, culture dynamics is a phenotypic behaviour of the viable cell during the process period. Such dynamic models either do not consider transcriptional regulation or consider it in a non-kinetic way, for example, qualitatively through Boolean logic or multi-valued logical rules [6,179]. When modelling a biological regulatory system of interest, special care must be devoted to the time frames of reactions and processes involved, as it appears commonly in biology that a system in its whole may span over various time scales of nanoseconds [180] to days [2]. As a rule of thumb, reactions that happen with a time constant one order of magnitude larger than time constant of the system are taken into account as frozen, and concentration of metabolites associated with them is considered as fixed variables. However, it is trickier to handle the faster reactions, in this case two assumptions are used to approximate underlying concentrations. First, the rapid equilibrium assumption and second the quasi-steady state (QSS) assumption [2,55]. Briefly, QSS assumption states that where the enzyme is available in catalytic amount, ES complex reaches steady state fast enough that we can assume its concentration remains constant. Rapid equilibrium assumption implies that the substrate rapidly reaches equilibrium with the enzyme-substrate complex, with the rates of both directions of the binding reaction being faster than the conversion step of the enzyme-substrate complex to enzyme-product complex (see the Convenience Kinetics (CK) section). It is worth mentioning that the time frames of regulation at the metabolic level are usually notably shorter compared to the regulations imposed at the epigenetic level. It is believed that the former takes place in the order of minutes, while the latter happens in hours [181]. Accordingly, the metabolic level regulation of enzyme activity can be viewed as fine-tuning, as opposed to the coarse control associated with regulatory mechanisms at the epigenetic level [182]. The comparative time scale of cellular operations along with a schematic representation of genetic and metabolic level regulatory mechanisms are given ( Figure 6). Metabolic regulation is increasingly considered in metabolic modelling studies, where its importance is clearly demonstrated [151,178,[183][184][185][186]. However, particularly the hierarchical nature of relaxation times in regulatory mechanisms along with the inherent interactions at signalling pathways makes the time-dependent quantification of the cellular regulations challenging [186][187][188]. The kinetic metabolic model formulations that can accommodate dynamics of the regulatory interactions between modifiers and enzymes have the advantage to be used to generate hypotheses for a wider set of experiment designs. Following the enhancements of GMA in [6,54], Drager et al. [159] reformulated GMA to describe a reaction including a modification term as follows: The F j function is used to introduce activation and inhibition effects [153]: In the following kinetic rate laws, the metabolic regulation can be accommodated explicitly and with a mechanistic justification.

Michaelis-Menten Kinetic Expression for Enzymatic Reactions
L. Michaelis and M.L. Menten [189] developed a general theory in 1913 to explain enzyme kinetics, following V. Henri [190] who had already taken important steps toward describing saturable enzyme kinetics. The mathematical model developed based on this theory is commonly used to describe the kinetic expression of the enzymatic reactions. It simulates experimental results by assigning two parameters, v + m and K M S . This model has the least error when applied on a reaction with one substrate being converted to one product after forming the substrate-enzyme complex as the intermediate [189,191]. Michaelis-Menten model is the default choice to model an enzymatic reaction [175,186], numerous studies incorporated Michaelis-Menten kinetics in their kinetic models [10,151,159,192,193]. The schematic of Michaelis-Menten mechanism and the corresponding mathematical formulations are as shown in Figure 7. The reversible Michaelis-Menten expression is given in Figure 7a for a uni-uni reaction, by equating [P] to zero it reduces to the well-known classical Michaelis-Menten equation, the formula for the case where inhibition is introduced is given (Figure 7b) [6]. In a review study Tummler et al. [153] discuss the assumptions accompanied with enzyme kinetics and how new types of experimental data can be incorporated to enhance estimation and calculation of Michaelis-Menten parameters. The experimental measurements are categorized for flux v, enzyme E T , kinetic parameters and substrate concentration S as follows: 1. Metabolic flux quantification: Carbon labelling experiments, uptake rate of nutrients, secretion rate of products. Indeed, Ghorbaniaghdam et al. [10] considered regulatory functions from energy shuttles ATP/ADP and cofactors N ADH/N ADPH in their dynamic model based on the multiplicative Michaelis-Menten kinetics, to simulate the behaviour of CHO cells. They modified flux kinetic equations to consider the effects of inhibitors and activators, and obtained satisfactory results simulating experimental data of the extracellular (and intracellular) analytics with low errors.

Convenience Kinetics (CK)
This framework is the most similar to the conventional Michaelis-Menten representation of enzymatic reactions in terms of formulating kinetics of a reversible enzymatic reaction. The authors in [7] point out thermodynamic dependency of the kinetic parameters in the rate law representation of the reactions in a metabolic network as an undesired phenomenon, because it obstructs the scan of the kinetic parameter space for parameter estimation methods, i.e., optimization algorithms. They argue that this thermodynamic dependency arises from the theory that the Gibbs free energies of formation of the metabolites are determinants of the equilibrium constants of the network reactions, and the equilibrium constants are kinetic parameters affecting the model behaviour. Thus, it is likely for a mechanistic model of metabolism to produce thermodynamically wrong outputs if such dependencies are not addressed. To resolve this issue, they derive CK from a random-order enzyme mechanism with simplifying assumptions with respect to the order and the binding energies of enzyme binding and dissociation. CK can be extended to model kinetic laws in an entire metabolic network, then a set of independent parameters are estimated and used to determine the rest of kinetic parameters. Also, the formulation justifies the regulatory terms mechanistically, in this case, the assumption is that the enzyme has sites for the activation and inhibition terms where binding mechanisms is comparable to the binding mechanisms of the formation of enzyme-substrate complexes. Parameters k A , k I in Equation (22) are defined similarly to the same parameters in Equation (20) and denote the dissociation constants for the activation and inhibition enzyme binding. The mathematical formulation for a reversible reaction in a metabolic network of any size is derived as follows (Equation (21)): where s ± ij are absolute values of all positive and negative elements of stoichiometric matrix S. E j , denotes enzyme concentration in the reaction j in (mM). In the original formulation, where K M ji is the counterpart of Michaelis-Menten constants denoting the dissociation constant of enzyme j for the metabolite i (in mM). k cat ±j denote turnover rates in (s −1 ) for the reversible reaction of substrate-enzyme and product-enzyme complexes. Similar to Equation (20), h A and h I functions are introduced for the activation and inhibition effects from a metabolite (m) on the reaction j. k A jm and k I jm denote the dissociation constants for activation and inhibition modifiers [7,159].
W ± being regulation matrix, with w jm = 1, −1 or 0 denoting activation, inhibition or no regulatory action from the metabolite m on the enzyme j, respectively [7,159]. The kinetics formulation of reaction fluxes is the single most important factor that determines success of dynamic kinetic modelling. It is different from constraint-based models (CBMs), where the formulation of a hypothesized objective function determines the model performance [17,169]. Enforcing the Wegsheider condition constrains the equilibrium constant values of the reactions of the network and consequently provides relationships between the kinetic parameters, such relationship are employed to determine dependent kinetic parameters from the independent ones.

Cybernetic Metabolic Models
The cybernetic modelling approach developed by Ramkrishna and coworkers assumes a cybernetic basis for the regulation of metabolic system behaviour, thus, the pursuit of the metabolic goal provides computational prediction of metabolic dynamics modelled through cybernetic variables characterizing the control of enzyme synthesis and enzyme activity [194,195]. The metabolic goal formulated in the cybernetic kinetics differs from the objective function of CBMs discussed in Section 4, here, it is assumed that the cells respond to environmental changes by opting for investment on the synthesis and activity regulation of those enzymes, which result in the desired metabolic states faster than an alternative metabolic machinery would do [196,197]. Moreover, the assumed kinetic mechanisms differ from the conventions discussed in the previous subsections as it is not derived from an elementary description of enzymatic reactions, with the possible inclusion of the regulation on the single-reaction level (see Figure 7), instead the kinetic equations describe lumped reactions catalyzed by key enzymes and offer a global representation of complex regulatory circuits [198]. Particularly the earlier use of cybernetic kinetics was concerned with the growth dynamics prediction for the unicellular microorganisms growing on multiple substrates [195,197,199], in such applications the cybernetic model is unstructured, or containing few precursor pools that undergo metabolic transformations catalyzed by the key enzymes. However, in the more recent studies, the cybernetic modelling approach was extended in combination with the MPA concepts, such as EFMA, to circumvent the considerable dependence of cybernetic models on modellers' intuition to decompose metabolic networks into functional lumped metabolic routes [198], this variation of the cybernetic models are thus considered as notable examples of hybrid metabolic modelling [200,201]. Despite the effectiveness of the cybernetic approach for unicellular organisms, this framework is not sufficient yet to describe metabolic behaviour of the multicellular organisms, thus, researchers aim in the recent developments of the cybernetic modelling particularly to address this shortcoming by use of multiple objectives for biological systems [202].

Parameter Estimation Formulation
Possibility of constructing large-scale kinetic models significantly depends on the availability of physiologically valid kinetic parameters. The kinetic parameters for mechanistic models include enzyme concentration, turnover rate of the enzyme for both the reaction directions, dissociation constants for reactants bound to the enzyme, inhibition and activation constants and equilibrium constants, i.e., the standard Gibbs free energy of the enzymatic reaction. The resulting large and non-convex parametric space with undefined dimensions renders the estimation of parameters particularly challenging and intractable when the network size is enlarged [7,203,204]. For the isolated enzymatic reactions, the kinetic parameters are estimated directly through in vitro assays specifically designed to characterize particular enzymes, and the values are collected in the enzymes properties databases [203,205]. However, it is quite cost and labour intensive to characterize the majority of enzymes in the metabolic network of an organism, in comparable experiment conditions. In fact, it has been shown that enzyme characteristics have considerable dependencies to the thermodynamic state of the reaction solution. This means that the possible differences in pH, ionic strength, temperature and abundance of cofactors between the observation situation and the original measurement setup may cause the measured kinetic parameters to be incompatible or inappropriate for a certain model of the observation. In the parameter estimation of a metabolic network, thus, it is inevitable in most studies to implement an indirect estimation of parameters through minimization of the discrepancy between experimental data and model simulations. Yet provided data in databases can be used as initial guesses and/or to impose bounds on the parameters to be estimated [156,206]. Parameter estimation in the kinetic metabolic modelling can be formulated as an optimization problem subject to the context specific types of constraints [169,207,208]. When the problem is formulated, the estimation procedure starts from a set of initial guesses of the parameter values, which then are evaluated according to a quality measure such as fitness value of the comparison between model results and available data. Then, the parameter values are refined by scanning the parameter space in various directions until a satisfactory outcome is achieved, i.e., the parameters minimized an objective function which is usually the weighted sum of squared distances between model simulations and the associated experimental values. An optimization algorithm must be used to find the best direction, along which the parameter values are changed to calculate a reduced objective function value within the imposed constraints, consistent with the non-equilibrium thermodynamics of biosystem [159,[209][210][211][212][213]. Of importance, a mechanistic kinetic metabolic model includes a set of nonlinear functions of the kinetic parameters and consequently forms a nonlinear optimization problem for which linear (or mixed integer linear) programming optimization methods perform poorly. However, regardless of the implemented optimization algorithm, the workflow of parameter estimation can be summarized in the following global steps: 1. Objective function formulation: enhancements can be introduced by collection of the data in several duplicates to then adjust the relative weight of the error in states experimental values. 2. Constraints definition: bounds on parameter values should be introduced to keep them in the feasible biological ranges. 3. Algorithm and solver selection: nature of the optimization problem and available data play a significant role in this step. 4. Optimization option assignment: a set of options must be assigned for the solver.
These options are descriptive of optimization algorithm's details and accuracy.
Given the following generic objective function formulation (Equation (23)), countless methods have been employed to solve for a vector of parameters θ minimizing the function S(θ) for a metabolic network under study.
Evidently, the number of parameters increases with the size of the network, the extent of regulation details and the extent of interactions between the variables formulated by kinetic rate laws of the biochemical reactions. The wide array of optimization techniques search to find the minimum of S(k) in a constrained subspace of R p , where n is the number of free parameters, i.e., length of vector k if all parameters are independent. Mathematical formulation of the constraints is not trivial and demands an understanding of the biosystem. While global optimization algorithms are used to find the global minimum within the range of all inputs, local optimization methods are used to search for the local minima in the neighbourhood of a vector of initial guesses, i.e., a reference steady state of the biosystem. Global optimization methods are highly computationally demanding as opposed to the local optimization [152,156]. Optimization methods can be categorized into deterministic and stochastic algorithms. In a deterministic optimization, no random element appears and algorithms are based on mathematical scheduling, i.e., derivativebased algorithms. Conversely, in stochastic optimization, random searching procedures are applied to find a next step in the direction of extrema, in this manner it is more likely that the algorithm will not be trapped in local optimum [209,210,214]. Various studies confessed that there is not a single recipe to tackle the parameter estimation problem of metabolic modelling approaches [215][216][217], selection of an appropriate method depends on the type of formalism used to represent kinetic rate laws, the experimental data that is available to the modeller as well as the capacity of accessible computational frameworks [210]. In an ideal situation, a comprehensive set of collected data can be utilized to examine more than one complementary method of kinetic parameter estimation [216,218]. To reduce the computational cost of parameter estimation, several approaches have been proposed in literature [146], here we list the most popular ones: • To employ meta-heuristics optimization algorithms such as particle swarm [219] and genetic algorithm [220]. • To impose a reference state and estimate the biosystem parameters around this state [221]. • To introduce thermodynamic limitations and therefore limit the parameter solution space [222]. • To introduce local stability constraints [223]. • To reduce the model directly through the model reduction techniques (listed in [146]).
Drager et al. compared different evolutionary algorithms for the optimization of mixed models comprising CK and Michaelis-Menten kinetics in Corynebacterium glutamicum. They found that differential evolution (DE) and particle swarm optimization (PSO) resulted in the best parameters approximation, moreover, Tribes algorithm is useful for the first optimization attempts because of its performance and its user-friendly traits [159,224]. In a related study of Spieth et al. [225], evolution strategy (ES) and DE led to the best parameters estimation. Taken together these studies, it can be concluded that DE is an adequate estimation method for kinetic models [224]. DE is an accurate, reliable, robust and fast estimation algorithm which has few control parameters, and an easy to use and powerful search capability [226,227]. However, due to its fast convergence speed and low risk of divergence, it may get trapped within local minima. In addition, it is sensitive to its control parameters and if the population size increases, it becomes more computationally demanding [206,209,227,228]. On the other hand, authors in a recent study [212], developed a (deterministic) gradient-based kinetic parameter estimation algorithm that provides the best fit to multiple sets of metabolic data, i.e., fluxomics, metabolomics. A large metabolic network of E. coli (k-ecoli307) with 307 reactions and 258 metabolites is used to develop a kinetic model with 2367 kinetic parameters. The applicability of K-FIT is demonstrated in the estimation of the kinetic parameters by fitting the model to 13C-labelling data of multiple genetic perturbation mutants. It is reported that the algorithm works three orders of magnitude faster in comparison with meta-heuristics for estimating kinetic parameters of the test model [229]. A general workflow for kinetic parameter estimation is given in Figure 8.
As pointed out several times in this review, the modelling of metabolic networks is intrinsically an iterative effort. In Figure 9, we show schematically where the reviewed modelling approaches fit in modelling of cell metabolism.

Perspectives and Challenges Ahead
In this review, we assessed the mostly used modelling approaches of metabolic networks. The metabolic network was understood as a highly interconnected system of bioreactions, where various carbon sources are metabolized to produce energy and product molecular compounds. Then, the approaches to characterize the metabolic networks of distinct organisms was discussed. We showed that the in silico description of metabolic fluxes in large networks at steady state is performed successfully, by the modelling approaches based on the understanding that resulting outcome of the cell operation is optimized with regard to certain objectives. In fact, the mathematical formulation of the metabolic networks relying on the stoichiometric matrix of the network motivated several approaches collectively referred to as constrained-based models (CBMs). This modelling paradigm has been further established in recent years by the development of theoretical and computational extensions based on CBMs principles. Following the contributions to develop dFBA based on both static and dynamic optimization methods [40,141], efforts also focused on the integration of transcriptional regulation [230,231], and the consideration of resource allocations in terms of enzyme production cost [12,232]. Particularly in systems metabolic engineering of strains, rewarding applications of MPA methods fueled continual development of up-to-date mathematical modelling frameworks [233,234], recently finding optimal operating points in two-stage bioprocesses [235]. The challenges associated with the expansion of the CBMs can be related to the biological justification of the underlying assumptions and the efficiency of the computational methods employed to solve the core optimization problems. The CBMs collectively proved successful in complementing some particular types of interests including recombinant DNA technology [236] and optimal growth medium formulation [237] by finding answers to the questions regarding estimation of maximum theoretical yield, estimation of the growth rate of a certain strain, or identification of candidate genes for knockouts or gene overexpression. Typically, CBMs perform well as far as the objective of modelling practice is not to quantify how the intracellular fluxes are evoked and regulated by the activities of enzymes and what their response to perturbations are. Also, in more complex metabolic networks such as mammalian cells that are simultaneously growing on multiple carbon substrates, the applicability of the various constraint-based methods are still rather limited. The established principles and mathematical justification of constraint-based modelling and its successful application for metabolic engineering purposes will support its continuous enhancement to define the hypothetical best the cell can do with decreasing uncertainty and thus identifying what the cell is not capable of doing.
In a conceptually different modelling paradigm, we assessed the established modelling approaches to model flux kinetics of the regulated enzymatic reactions based on the hypothesis that changes in enzyme activities during the cell operation is mechanistically describable in terms of chemical activity of substances which regulate or determine enzyme activities. We reviewed MCA and BST as two founding theories for the systemic kinetics analysis of the biochemical reaction networks at (or close to) steady state, where the network responses to perturbations can be quantified and the distribution of flux control between several regulatory sites (i.e., enzymes) is appreciated. In general, kinetic modelling adds accuracy to the prediction of perturbation outcomes and gives flexibility to the simulation of different scenarios, mainly because it is rooted in the inclusion of the detailed mechanistic description of the regulatory and compensatory mechanisms of the cell by use of a wide array of kinetic formats. In addition to the metabolic fluxes, dynamic models can reflect intracellular metabolites concentration not only at steady state but also in the transient time [10,11,149]. It consequently makes dynamic models a better candidate in comparison with their constraint-based counterparts for the implementation of monitoring and control strategies in bioprocess management [18,19,85,86,148]. However, the hindering obstacles for all kinetic models to reach genome-scale are limited available data for intracellular concentrations and the complex procedure of selecting kinetic rate laws with identifiable mechanistic parameters. To circumvent, hybrid models have been proposed where for the part of network for which mechanistic data is available kinetic rate laws are used and the rest of the network is retained in its purely stoichiometric format. Observance of the emergent properties such as ultra-sensitivity (switch-like behaviour), bistability and oscillations which cannot be attributed to any single reaction or constituent of the network, but is only explainable with a systems understanding, further necessitates a dynamic nonlinear analysis of the underlying metabolic system. The importance of control and regulatory functional units inherent in a metabolic network can be hardly overestimated. In fact, after comparing Michaelis-Menten with approximate rate laws in deriving a kinetic model for Red Blood Cells (RBCs), Du et al. [8] concluded that it is best to construct "mechanistically detailed enzyme modules" whenever the available data on enzyme properties allows that. It is reported that mixed models defined with Michaelis-Menten equations for one substrate-one product reactions combined with CK has the best prediction capacity [159]. It is justified based on the drawbacks of Michaelis-Menten formalism as a sole representation for the reactions in a network of interconnected reactions compared with the system-level definition of CK. Also, CK formulates complex regulatory mechanisms independently and thus reduces the complexity of regulated Michaelis-Menten equation on the way of interpreting the underlying biological principles. In addition, the convenience kinetics rate law is easy to use for parameter estimation and optimization and its models have comparatively accurate approximations and results [7].
For the parameter estimation of dynamic kinetic models incorporating mechanistic parameters, several good enough practices have been reported in literature, i.e., evolutionary meta-heuristics algorithms. Particularly, when the gradient-based optimization methods appear to get trapped in suboptimal extrema, meta-heuristics optimization approaches are employed to circumvent the inherent nonlinearity of metabolic networks and the expansion of parameters in an exceeding number of dimensions. We envision that the development of kinetic models that can encompass as many as possible different data types for its fit, particularly enhances the parameter estimation quality. With nowadays expansion of quantitative omics data and increasing biochemical thermodynamics data, we envision that the successful metabolic modelling approaches of the future will help integrate more insight from these inputs in a consistent computational framework.
The appreciation for mathematical modelling and Systems Biology in general is growing rapidly, partially due to the promising prospects of mathematical metabolic models efficacy in a wide range of applications including drug target prediction and in-silico clinical trials [238]. Although it is yet unrealistic to adopt a full-scale patient-specific model for clinical trials, patient-specific metabolic models show potential for use in evaluation of medicinal products and devices, or in prediction of the outcome of medical interventions [239] by harnessing optimization technologies including artificial intelligence (AI) to dynamically and accurately determine the required treatment [240]. Thus, we envision mathematical metabolic models to become more involved in the inception and design of new practical technologies. In fact, the fast-paced emerges of genome sequencing, poses challenging questions such as: What does the complete set of pathways do? How do various cellular regulation mechanisms interact to determine high-producing and low-producing strains? [178] What does go wrong that a healthy organism becomes pathological? [241]. Finding the answer, demands vigorous complementary analytical and computational technologies to be developed, in addition to the pile of information from (reductionist) biological sciences. This cannot be done without a significant contribution, from the mathematical modelling approaches, in order to understand and simulate the relationships among genotype, phenotype, and environment of the cell.
Supplementary Materials: The following are available online at www.mdpi.com/xxx/s1, Text S1: Mathematical formulations of constraint-based models, Figure S1: EFMs enumeration for the running example with reversible reactions, Table S1: Properties of the different MPA methods, Table S2: The hypothesized objective functions of the cell and their biological rationales, Table S3: List of the flux balance analysis enhancements,