Dynamic Modeling of Cell-Free Biochemical Networks Using Effective Kinetic Models

Cell-free systems offer many advantages for the study, manipulation and modeling of metabolism compared to in vivo processes. Many of the challenges confronting genome-scale kinetic modeling can potentially be overcome in a cell-free system. For example, there is no complex transcriptional regulation to consider, transient metabolic measurements are easier to obtain, and we no longer have to consider cell growth. Thus, cell-free operation holds several significant advantages for model development, identification and validation. Theoretically, genome-scale cell-free kinetic models may be possible for industrially important organisms, such as E. coli, if a simple, tractable framework for integrating allosteric regulation with enzyme kinetics can be formulated. Toward this unmet need, we present an effective biochemical network modeling framework for building dynamic cell-free metabolic models. The key innovation of our approach is the integration of simple effective rules encoding complex allosteric regulation with traditional kinetic pathway modeling. We tested our approach by modeling the time evolution of several hypothetical cell-free metabolic networks. We found that simple effective rules, when integrated with traditional enzyme kinetic expressions, captured complex allosteric patterns such as ultrasensitivity or non-competitive inhibition in the absence of mechanistic information. Second, when integrated into network models, these rules captured classic regulatory patterns such as product-induced feedback inhibition. Lastly, we showed, at least for the network architectures considered here, that we could simultaneously estimate kinetic parameters and allosteric connectivity from synthetic data starting from an unbiased collection of possible allosteric structures using particle swarm optimization. However, Processes 2015, 3 139 when starting with an initial population that was heavily enriched with incorrect structures, our particle swarm approach could converge to an incorrect structure. While only an initial proof-of-concept, the framework presented here could be an important first step toward genome-scale cell-free kinetic modeling of the biosynthetic capacity of industrially important organisms.


Introduction
Mathematical modeling has long contributed to our understanding of metabolism.Decades before the genomics revolution, mechanistically, structured metabolic models arose from the desire to predict microbial phenotypes resulting from changes in intracellular or extracellular states [1].The single cell E. coli models of Shuler and coworkers pioneered the construction of large-scale, dynamic metabolic models that incorporated multiple, regulated catabolic and anabolic pathways constrained by experimentally determined kinetic parameters [2].Shuler and coworkers generated many single cell kinetic models, including single cell models of eukaryotes [3,4], minimal cell architectures [5], as well as DNA sequence based whole-cell models of E. coli [6].Conversely, highly abstracted kinetic frameworks, such as the cybernetic framework, represented a paradigm shift, viewing cells as growth-optimizing strategists [7].Cybernetic models have been highly successful at predicting metabolic choice behavior, e.g., diauxie behavior [8], steady-state multiplicity [9], as well as the cellular response to metabolic engineering modifications [10].Unfortunately, traditional, fully structured cybernetic models also suffer from an identifiability challenge, as both the kinetic parameters and an abstracted model of cellular objectives must be estimated simultaneously.However, recent cybernetic formulations from Ramkrishna and colleagues have successfully treated this identifiability challenge through elementary mode reduction [11,12].
In the post genomics world, large-scale stoichiometric reconstructions of microbial metabolism popularized by static, constraint-based modeling techniques such as flux balance analysis (FBA) have become standard tools [13].Since the first genome-scale stoichiometric model of E. coli, developed by Edwards and Palsson [14], well over 100 organisms, including industrially important prokaryotes such as E. coli [15] or B. subtilis [16], are now available [17].Stoichiometric models rely on a pseudo-steady-state assumption to reduce unidentifiable genome-scale kinetic models to an underdetermined linear algebraic system, which can be solved efficiently even for large systems.Traditionally, stoichiometric models have also neglected explicit descriptions of metabolic regulation and control mechanisms, instead opting to describe the choice of pathways by prescribing an objective function on metabolism.Interestingly, similar to early cybernetic models, the most common metabolic objective function has been the optimization of biomass formation [18], although other metabolic objectives have also been estimated [19].Recent advances in constraint-based modeling have overcome the early shortcomings of the platform, including capturing metabolic regulation and control [20].
Thus, modern constraint-based approaches have proven extremely useful in the discovery of metabolic engineering strategies and represent the state of the art in metabolic modeling [21,22].However, genome-scale kinetic models of industrial important organisms such as E. coli have yet to be constructed.
Cell-free systems offer many advantages for the study, manipulation and modeling of metabolism compared to in vivo processes.Central amongst these advantages is direct access to metabolites and the microbial biosynthetic machinery without the interference of a cell wall.This allows us to control as well as interrogate the chemical environment while the biosynthetic machinery is operating, potentially at a fine time resolution.Second, cell-free systems also allow us to study biological processes without the complications associated with cell growth.Cell-free protein synthesis (CFPS) systems are arguably the most prominent examples of cell-free systems used today [23].However, CFPS is not new; CFPS in crude E. coli extracts has been used since the 1960s to explore fundamentally important biological mechanisms [24,25].Today, cell-free systems are used in a variety of applications ranging from therapeutic protein production [26] to synthetic biology [27].Interestingly, many of the challenges confronting genome-scale kinetic modeling can potentially be overcome in a cell-free system.For example, there is no complex transcriptional regulation to consider, transient metabolic measurements are easier to obtain, and we no longer have to consider cell growth.Thus, cell-free operation holds several significant advantages for model development, identification and validation.Theoretically, genome-scale cell-free kinetic models may be possible for industrially important organisms, such as E. coli or B. subtilis, if a simple, tractable framework for integrating allosteric regulation with enzyme kinetics can be formulated.
In this study, we present an effective biochemical network modeling framework for building dynamic cell-free metabolic models.The key innovation of our approach is the seamless integration of simple effective rules encoding complex regulation with traditional kinetic pathway modeling.This integration allows the description of complex regulatory interactions, such as time-dependent allosteric regulation of enzyme activity, in the absence of specific mechanistic information.The regulatory rules are easy to understand, easy to formulate and do not rely on overarching theoretical abstractions or restrictive assumptions.We tested our approach by modeling the time evolution of several hypothetical cell-free metabolic networks.In particular, we tested whether our effective modeling approach could describe classically expected enzyme kinetic behavior, and second whether we could simultaneously estimate kinetic parameters and regulatory connectivity, in the absence of specific mechanistic knowledge, from synthetic experimental data.Toward these questions, we explored five hypothetical cell-free networks.Each network shared the same enzymatic connectivity, but had different allosteric regulatory connectivity.We found that simple effective rules, when integrated with traditional enzyme kinetic expressions, captured complex allosteric patterns such as ultrasensitivity or non-competitive inhibition in the absence of mechanistic information.Second, when integrated into network models, these rules captured classical regulatory patterns such as product-induced feedback inhibition.Lastly, we showed, at least for the network architectures considered here, that we could simultaneously estimate kinetic parameters and allosteric connectivity from synthetic data starting from an unbiased collection of possible allosteric structures using particle swarm optimization.However, when starting with an initial population that was heavily enriched with incorrect structures, our particle swarm approach could converge to an incorrect structure.While only an initial proof-of-concept, the framework presented here could be an important first step toward genome-scale cell-free kinetic modeling of the biosynthetic capacity of industrially important organisms.

Formulation and Properties of Effective Cell-Free Metabolic Models
We developed two proof-of-concept metabolic networks to investigate the features of our effective biochemical network modeling approach (Figure 1).In both examples, substrate S was converted to the end products P 1 and P 2 through a series of enzymatically catalyzed reactions, including a branch point at hypothetical metabolite M 2 .Several of these reactions involved cofactor dependence (AH or A), and various allosteric regulatory mechanisms modified the activity of pathway enzymes.Network A included feedback inhibition of the initial pathway enzyme (E 1 ) by pathway end products P 1 and P 2 (Figure 1A).On the other hand, network B involved feedback inhibition of E 1 by P 2 and E 6 by P 1 (Figure 1B).In both networks, branch point enzymes E 3 and E 6 were subject to feed-forward activation by reduced cofactor AH.Lastly, it is known experimentally that cell-free systems have a finite operational lifespan.Loss of biosynthetic capability could be a function of many factors, e.g., cofactor or metabolite limitations.We modeled the loss of biosynthetic capability as a non-specific first-order decay of enzyme activity.Proof-of-concept cell-free metabolic networks considered in this study.Substrate S is converted to products P 1 and P 2 through a series of chemical conversions catalyzed by enzyme(s) E j .The activity of the pathway enzymes is subject to both positive and negative allosteric regulation.
Allosteric regulation of enzyme activity was modeled by combining individual regulatory contributions to the activity of pathway enzymes into a control coefficient using an integration rule (Figure 2).This strategy is similar in spirit to the Constrained Fuzzy Logic (cFL) approach of Lauffenburger and coworkers which has been used to effectively model signal transduction pathways important in human health [28].In our formulation, Hill-like transfer functions 0 ≤ f (Z) ≤ 1 were used to calculate the influence of factor abundance upon target enzyme activity.In this context, factors can be individual metabolite levels or some function, e.g., the product of metabolite levels.However, more generally, factors can also correspond to non-modeled influences, categorial variables or other abstract quantities.In the current study, we simply let Z correspond to the abundance of individual metabolites, however in general this can be a complex function of both modeled and unmodeled factors.When an enzyme was potentially sensitive to more than one regulatory input, logical integration rules were used to select which regulatory transfer function influenced enzyme activity at any given time.Thus, our test networks involved important features such as cofactor recycling, enzyme activity and metabolite dynamics, as well as multiple overlapping allosteric regulatory mechanisms.The rule-based regulatory strategy approximated the behavior of classical allosteric activation and inhibition mechanisms (Figure 3).We considered the enzyme catalyzed conversion of substrate S to a product P, where the overall reaction rate was modeled as the product of a Michaelis-Menten term and an effective allosteric control variable reflecting the particular regulatory interaction.We first explored feed-forward substrate activation of enzyme activity (for both positive and negative cooperativity).Consistent with classical data, the rule-based strategy predicted a sigmoidal relationship between substrate abundance and reaction rate as a function of the cooperativity parameter (Figure 3A).For cooperativity parameters less than unity, increased substrate abundance decreased the maximum reaction rate.This was consistent with the idea that substrate binding decreased at regulatory sites, which negatively impacted substrate binding at the active site.On the other hand, as the cooperativity parameter increased past unity, the rate of conversion of substrate S to product P by enzyme E approached a step function.In the presence of an inhibitor, the rule-based strategy predicted non-competitive like behavior as a function of the cooperativity parameter (Figure 3B).When the control gain parameter, κ ij in Equaion (10), was greater than unity, the inhibitory force was directly proportional to the cooperativity parameter, η in Equation (10).Thus, as the cooperativity parameter increased, the maximum reaction rate decreased (Figure 3B).Interestingly, our rule-based approach was unable to directly simulate competitive inhibition of enzyme activity.Taken together, the rule-based strategy captured classical regulatory patterns for both enzyme activation and inhibition.Thus, we are able to model complex kinetic phenomena such as ultrasensitivity, despite an effective description of reaction kinetics.

A B
κcontrol > 1 End product yield was controlled by feedback inhibition, while product selectivity was controlled by branch point enzyme inhibition (Figure 4).A critical test of our modeling approach was to simulate networks with known behavior.If we cannot reproduce the expected behavior of simple networks, then our effective modeling strategy, and particularly the rule-based approximation of allosteric regulation, will not be feasible for genome-scale cell-free problems.We considered two cases, control ON/OFF, for each network configuration.Each of these cases had identical kinetic parameters and initial conditions; the only differences between the cases were the allosteric regulation rules and the control parameters associated with these rules.As expected, end product accumulation was larger for network A when the control was OFF (no feedback inhibition of E 1 by P 1 and P 2 ), as compared to the ON case (Figure 4A).We found this behavior was robust to the choice of underlying kinetic parameters, as we observed that same qualitative response across an ensemble of 100 randomized parameter sets, for fixed control parameters.The control ON/OFF response of network B was more subtle.In the OFF case, the behavior was qualitatively similar to network A. However, for the ON case, flux was diverted away from P 2 formation by feedback inhibition of E 6 activity at the M 2 branch point by P 1 (Figure 4B).Lower E 6 activity at the M 2 branch point allowed more flux toward P 1 formation, hence the yield of P 1 also increased (Figure 4C).Again, the control ON/OFF behavior of network B was robust to changes in kinetic parameters, as the same qualitative trend was conserved across an ensemble of 100 randomized parameters, for fixed control parameters.Taken together, these simulations suggested that the rule-based allosteric control concept could robustly capture expected feedback behavior for networks with uncertain kinetic parameters.

Estimating Parameters and Effective Allosteric Regulatory Structures
A critical challenge for any dynamic model is the estimation of kinetic parameters.For metabolic processes, there is also the added challenge of identifying the regulation and control structures that manage metabolism.Of course, these issues are not independent; any description of enzyme activity regulation will be a function of system state, which in turn depends upon the kinetic parameters.For cell-free systems, regulated gene expression has been removed, however, enzyme activity regulation is still operational.We explored this linkage by estimating model parameters from synthetic data using both network structures.We generated synthetic measurements of the substrate S, intermediate M 5 and end product P 1 approximately every 20 min using network A. This data set is similar to published cell-free studies, both in terms of network coverage and sampling frequency [23].We then generated an ensemble of model parameter estimates by minimizing the difference between model simulations and the synthetic data using particle swarm optimization (PSO), starting from random initial parameter guesses.The estimation of kinetic parameters was sensitive to the choice of regulatory structure (Figure 5).PSO identified an ensemble of parameters that bracketed the mean of the synthetic measurements in less than 1000 iterations when the control structure was correct (Figure 5A,B).However, with control mismatch (network B simulated with network A parameters), model simulations were not consistent with the synthetic data (Figure 5C,D).Taken together, these results suggested that we could perhaps simultaneously estimate both parameters and network control architectures, as incorrect control structures would be manifest as poor model fits. .Parameter estimation from synthetic data for the same and mismatched allosteric control logic using particle swarm optimization (PSO).Synthetic experimental data was generated from a hypothetical parameter set using Network A, where substrate S, end product P 1 and intermediate M 5 were sampled approximately every 20 min.For cases (A,B) 20 particles were initialized with randomized parameters and allowed to search for 300 iterations.(A,B) PSO estimated an ensemble of 20 parameters sets consistent with the synthetic experimental data assuming the correct enzymatic and control connectivity starting from randomized initial parameters; (C,D) In the presence of control mismatch (Network B control policy simulated with Network A kinetic parameters) the ensemble of models did not describe the synthetic data.The synthetic data plotted here was unperturbed by noise.However, we assumed a constant coefficient of variation of 10% for the synthetic data during parameter estimation.
We modified our particle swarm identification strategy to simultaneously search over both kinetic parameters and putative control structures.In addition to our initial networks, we constructed three additional presumptive network models, each with the same enzymatic connectivity but different allosteric regulation of the pathway enzymes (Figure 6).We then initialized a population of particles, each with one of the five potential regulatory programs and randomized kinetic parameters.Thus, we generated an initial population of particles that had both different kinetic parameters as well as different control structures.We biased the distribution of the particle population according to our a prior belief of the correct regulatory program.To this end, we considered three different priors, a uniform distribution where each putative regulatory structure represented 20% of the population and two mixed distributions that were either positively or negatively biased towards the correct structure (network A).In both the positively biased and uniform cases the PSO clearly differentiated between the true or closely related structures and those that were materially different (Figure 7).As expected, the positively biased population (40% of the initial particle population seeded with network A) gave the best results, where the correct structure was preferentially identified (Figure 7A).On the other hand, when given a uniform distribution, the PSO approach identified a combination of network A and network C as the most likely control structures (Figure 7B).Network A and C differ by the regulatory connection between the end product P 2 and enzyme E 1 ; in network A, end product P 2 was assumed to inhibit E 1 , while in network C, end product P 2 activated E 1 .Lastly, when the initial population was heavily biased towards incorrect structures (initial population seeded with 90% incorrect structures), the particle swarm misidentified the correct allosteric structure (Figure 7C).Interestingly, while each particle swarm identified parameter sets that minimized the simulation error, the estimated parameter values were not necessarily similar to the true parameters.The angle between the estimated and true parameters was not consistently small across the swarms (identical parameters would give an angle of zero).This suggested that our particle swarm approach identified a sloppy ensemble, i.e., parameter estimates that were individually incorrect but collectively exhibited the correct model behavior. .Metabolic flux and control variables as a function of network type and particle index at t = 100 min.The particle error, the control variables governing E 1 , E 3 and E 6 activity (v 1 , v 3 and v 3 ) and the scaled metabolic flux were calculated for the positively (top), uniformly (middle) and negatively (bottom) biased particle swarms (N = 100).Blue denotes a low value, while red denotes a high value for the respective quantity being plotted.The particles from each swarm were sorted based upon simulation error (low to high error).(A) Model performance for the positively biased particle swarm as a function of particle index; (B) Model performance for the uniformly biased particle swarm as a function of particle index; (C) Model performance for the negatively biased particle swarm as a function of particle index.Models with significant control mismatch showed distinct control and flux patterns versus those models with the correct or closely related control policies.In particular, models with the correct control policy showed stronger inhibition of E 1 activity, leading to decreased flux from S→P 1 .Conversely, models with significant mismatch had increased E 1 activity, leading to an altered flux distribution.This is especially apparent in the negatively biased particle swarm.
We calculated control program output and scaled metabolic flux for the positively, uniformly and negatively biased particle swarms (Figure 8).Network A and network C models from the positively (Figure 8A) and uniformly (Figure 8B) biased particle swarms showed similar operational patterns, despite differences in kinetic parameters and control structures.While models from the negatively biased population had error values similar to the correct structures in the previous swarms, they have different flux and control profiles (Figure 8C).In all cases, regardless of network configuration or parameter values, the rate of enzyme decay was small compared to the other fluxes, and all networks had qualitatively similar trends for E 3 and E 6 control.Moreover, consistent with the correct model structure, production of end product P 1 was the preferred branch for all model configurations.However, there was variability in P 2 production flux across the population of models, especially for the uniform swarm when compared with the other cases.High P 1 branch flux resulted in end product inhibition of E 1 in both network A and network C, however in network D and E, high P 1 flux induced E 1 activation.These trends were manifested in different flux profiles, where the negatively biased population appeared more uniform across the population compared with the other swarms, and had higher E 1 specific activity.Interestingly, the behavior of network A and network C highlighted an artifact of our integration rule; both a positive or negative feedback connection from P 2 to E 1 were ignored because the P 1 inhibition of E 1 dominated.Thus, while theoretically distinct, network A and network C appeared operationally to the PSO algorithm to be the same network.On the other hand, networks B, D and E showed distinct behavior that was not consistent with the true network.These architectures exhibited either limited inhibition (network B) or activation (network D and E) of E 1 activity, resulting in significantly different metabolic flux profiles.However, the PSO was able to find low error parameter solutions, despite the mismatch in the control structures (error values similar, but not better than the best network A and network C estimates).Taken together, these results suggested that a uniform sampling approach could potentially yield an unbiassed estimate of both kinetic parameters and control structures.However, the negatively biased particle swarm results illustrated a potential shortcoming of the approach, namely convergence to a local error minimum despite a significantly incorrect control structure.This suggested that estimated model structures will need to be further evaluated, for example by generating falsifiable experimental designs which could distinguish between low error solutions.

Discussion
In this study, we presented an effective kinetic modeling strategy to dynamically simulate cell-free biochemical networks.Our proposed strategy integrated traditional kinetic modeling with an effective rules based approach to dynamically describe metabolic regulation and control.We tested this approach by developing kinetic models of hypothetical cell-free metabolic networks.In particular, we tested whether our effective modeling approach could describe classically expected behavior, and second whether we could simultaneously estimate kinetic parameters and regulatory connectivity, in the absence of specific mechanistic knowledge, from synthetic experimental data.Toward these questions, we explored five hypothetical cell-free networks.In each network, a substrate S was converted to the end products P 1 and P 2 through a series of enzymatically catalyzed reactions, including a branch point at a hypothetical metabolite M 2 .Each network also included the same cofactors and cofactor recycle architecture.However, while all five networks shared the same enzymatic connectivity, each had different allosteric regulatory connectivity.We found that simple effective rules, when integrated with traditional enzyme kinetic expressions, could capture complex allosteric patterns such as ultrasensitivity, or non-competitive inhibition in the absence of specific mechanistic information.
Moreover, when integrated into network models, these rules captured classical regulatory patterns such as product-induced feedback inhibition.Lastly, we simultaneously estimated kinetic parameters and discriminated between competing regulatory structures, using synthetic data in combination with a modified particle swarm approach.If we considered all putative regulatory architectures to be equally likely, we were able to estimate a sloppy ensemble of models with the correct architecture and kinetic parameters.Thus, we identified parameter values that were different from their true values, but nonetheless produced reasonable model performance (low error).This suggested that we captured important parameter combinations (stiff combinations), while simultaneously missing other parameter combinations (sloppy combinations).This was similar to the earlier study of Brown and Sethna [29], which showed that reasonable model predictions were possible, despite sometimes only order of magnitude parameter estimates, if the stiff parameter combinations were well constrained.
The proposed modeling strategy shares features with other popular techniques, but also has several key differences.At its core, our effective modeling approach is similar to regulatory constraint-based methods, and to the cybernetic modeling paradigm developed by Ramkrishna and colleagues.Covert, Palsson and coworkers drastically improved the predictability of constraint-based approaches by integrating Boolean rules into the calculation of metabolic fluxes [30].If the regulated intracellular flux problem is coupled with time-dependent extracellular balances, these models can predict complex behavior such as diauxie growth or the switch between aerobic and anaerobic metabolism.Another important feature of this approach is that it scales with biological complexity.For example, Covert et al.
showed that a genome-scale model of E. coli augmented with a Boolean rule layer, correctly predicted approximately 80% of the outcomes of a high-throughput growth phenotyping experiment in E. coli.Further, they showed that they could learn new biology by iteratively refining the model and its associated rules [31].However, while regulated flux balance analysis is a powerful technique, it does not easily allow the calculation of time-resolved metabolite abundance.Additionally, the Boolean rules which populate the regulatory layer are limited to ON/OFF decisions; for qualitative predictions of gene expression this is a reasonable limitation.However, Boolean rules will likely be less effective at capturing dynamic allosteric regulation in a cell-free metabolic system.On the other hand, the strength of cybernetic models is the integration of optimal metabolic control strategies with traditional kinetic pathway modeling.Cybernetic models are highly predictive; they have successfully predicted mutant behavior from limited wild-type data [10,32,33], steady-state multiplicity [9], strain specific metabolic function [12] and have been used in bioprocess control applications [34].However, cybernetic control strategies are not mechanistic, instead they are the output of an optimal decision with respect to a set of hypothetical physiological objectives.Thus, they are abstractions which are difficult to translate into a specific biological mechanism.Our approach addresses the shortcomings of both regulatory constraint-based models and cybernetic models.First, similar to cybernetic models, the core of our approach is a kinetic model.Thus, we are able to directly calculate the time evolution of metabolism, for example the dynamic abundance of network metabolites.Second, similar to regulatory flux balance analysis, our control laws describe specific mechanistic motifs, such as activation or inhibition of enzyme activity.However, our rules are continuous, thus they potentially allow a finer grained description of metabolic regulation and control mechanisms.Lastly, we can naturally incorporate unmodeled factors and categorical factors or combinations thereof into our control law formulations.Though requiring a more complex description of cellular metabolism, our approach may even be extended to simulate cell-based systems by incorporating the same control laws into transcription factor activation and gene expression regulation.
The proposed modeling framework also differs appreciably from previously established kinetic approximations of complex biochemical network behavior.Such frameworks replace parameter dense mechanistic kinetic expressions with heuristics quantifying the relationship between metabolic rate and metabolite effectors.A review of approximative kinetic formats can be found in [35].These approaches arose in response to uncertainties associated with obtaining correct mechanistic kinetic expressions and parameters of in vivo systems.Similarly, available kinetic parameters measured in vitro may differ in a specialized cell-free in vitro environments.Factors affecting kinetics, such as enzyme channeling, macromolecular crowding, and pH, are likely dramatically different in cell-free environments than in both in vivo systems as well as typical in vitro conditions used for parameter measurements.Thus, a more generalized, approximate biochemical reaction network formulation may be desirable in the case of cell-free systems.Our approach is similar to generalized mass action-based power law formulations of Savageau and colleagues [36] and linlog kinetics of Visser et al. [37] in that metabolic rates are proportional to corresponding enzyme levels modified by metabolite effectors.Power law and linlog approaches suffer from several limitations [35].Power law reaction rates do not capture saturation effects and become infinite for small concentrations of inhibitory regulators.Linlog kinetics also become ill-defined when effector concentrations go to zero.Also, models employing linlog kinetics typically rely on an experimentally determined reference state to describe dynamics taking place after a perturbation to a steady-state.Our framework does not suffer from such drawbacks.Moreover, cell-free systems are unlikely to satisfy such a steady-state approximation after extract preparation and prior to culture initiation.Our framework is similar to the generic kinetic formulation from Hadlich et al. [38], but differs in its inclusion of cooperative effects as well as proposes a simplified integration of competition amongst allosteric effectors using max/min rules.In summary, our proposed framework offers an effective kinetic approximation that captures saturation effects and allosteric competition within cell-free systems that may also be extensible to in vivo metabolic and gene regulatory networks.
There are several critical questions that should be explored following this proof-of-concept study.It is unclear how parameter identification will scale to genome-scale networks, and second it is unclear how we will identify allosteric connectivity at a genome-scale.The enzymatic connectivity for genome-scale cell-free networks can easily be established by stripping away the growth and cell wall machinery from whole cell genome reconstructions.Then metabolic fluxes can be transformed into kinetic expressions using heuristics such multiple saturation kinetics, which are then modified by our rule-based control variables.This leaves a large number of unknown kinetic constants that must be estimated from time-resolved metabolite measurements.Ensemble modeling is a well-established approach for parameter identification in large-scale deterministic models.Liao and coworkers developed a method that generates an ensemble of kinetic models that all approach the same steady-state, one determined by fluxomics measurements [39].The best subpopulation of candidate models were selected based on their agreement with further measurements of genetically perturbed systems.Our work relies on heuristic search optimization to identify kinetic models consistent with steady-state and dynamic time-series measurements of cellular species [40][41][42][43][44][45].Instead of estimating a single yet highly uncertain parameter set, both approaches estimate an ensemble of parameter sets whose model behavior recapitulates experimental measurements.Here, we showed that particle swarm optimization quickly identified an ensemble of model parameters, at least for proof-of-concept metabolic networks using synthetic data.This suggested that we can expect reasonable model predictions, despite only partial parameter knowledge, as network size grows if we have properly designed experiments.Though we expect computational complexity will scale poorly with network size, we are optimistic that large-scale, predictive models of metabolism are possible.There is evidence to suggest that achieving a quantitative understanding of complex biological systems should not require complete parametric knowledge.Brown and Sethna showed in a model of signal transduction that good predictions were possible despite only order of magnitude estimates of parameter values [29].Sethna and coworkers later showed that model performance is often controlled by only a few parameter combinations, a characteristic seemingly universal to multi-parameter models referred to as sloppiness [46].We have also demonstrated sloppy behavior in a wide variety of signal transduction processes [40][41][42][43][44][45].Thus, given our previous experience with models containing hundreds of unknown parameters, we expect parameter estimation to be a manageable challenge assuming we have good quality experimental data.
A second critical challenge will be the estimation of allosteric connectivity at a genome scale.The regulation of glycolytic enzymes, such as phosphofructokinase I, has been studied for many years [47,48].The allosteric regulation of metabolic enzymes can also be established from organism specific databases, such as EcoCyc [49], or more general allosteric databases, such as the AlloSteric Database [50].However, for those enzymes that have not been well studied, we will need to infer allosteric interactions from experimental data.In general, the reverse engineering of regulatory network structure from data is a difficult problem.Recently, Sauer and colleagues have developed a systematic, model-based approach for the identification of allosteric regulation in vivo [51].They tested the effects of many putative allosteric protein-metabolite interactions on the performance of a kinetic model of glycolysis against dynamic metabolomic and fluxomic measurements.A method similar to this may be easily applied to cell-free systems in order to identify relevant in vitro allosteric interactions.Because omics measurements of cell-free environments are easy to obtain, identification of large-scale allosteric control structures may be possible.Also, there are many different approaches from the reverse engineering of gene regulatory networks that perhaps could be adopted to this problem, however this remains an open question.For example, one could imagine designing pulse chase experiments which maximally distinguish between competing allosteric models, similar to the earlier work of Kremling et al. [52], or iteratively estimating model structures similar to Doyle and coworkers [53].Lastly, the choice of max/min integration rules or the particular form of the transfer functions could be generalized to include other rule types and functions.Theoretically, an integration rule is a function whose domain is a set of transfer function inputs, and whose range is v ∈ [0, 1].Thus, integration rules other than max/min could be used, such as the mean or the product, assuming the range of the transfer functions is always f ∈ [0, 1].Alternative integration rules such as the mean might have different properties which could influence model identification or performance.For example, a mean integration rule would be differentiable, which allows derivative-based optimization approaches to be used.The particular form of the transfer function could also be explored.We choose a Hill-like function because of its prominence in the systems and synthetic biology community.However, the only mathematical requirement for a transfer function is that it map a non-negative continuous or categorical variable into the range f ∈ [0, 1].Thus, many types of transfer functions are possible.

Formulation and Solution of the Model Equations
We used ordinary differential equations (ODEs) to model the time evolution of metabolite (x i ) and scaled enzyme abundance ( i ) in hypothetical cell-free metabolic networks: where R denotes the number of reactions, M denotes the number of metabolites and E denotes the number of enzymes in the model.The quantity r j (x, , k) denotes the rate of reaction j.Typically, reaction j is a non-linear function of metabolite and enzyme abundance, as well as unknown kinetic parameters k (K × 1).The quantity σ ij denotes the stoichiometric coefficient for species i in reaction j.
If σ ij > 0, metabolite i is produced by reaction j.Conversely, if σ ij < 0, metabolite i is consumed by reaction j, while σ ij = 0 indicates metabolite i is not connected with reaction j.Lastly, λ i denotes the scaled enzyme degradation constant.The system material balances were subject to the initial conditions x (t o ) = x o and (t o ) = 1 (initially we have 100% cell-free enzyme abundance).Each reaction rate was written as the product of two terms, a kinetic term (r j ) and a regulatory term (v j ): We used multiple saturation kinetics to model the reaction term rj : where k max j denotes the maximum rate for reaction j, i denotes the scaled enzyme activity which catalyzes reaction j, and K js denotes the saturation constant for species s in reaction j.The product in Equaion (4) was carried out over the set of reactants for reaction j (denoted as m − j ).The allosteric regulation term v j depended upon the combination of factors which influenced the activity of enzyme i.For each enzyme, we used a rule-based approach to select from competing control factors (Figure 2).If an enzyme was activated by m metabolites, we modeled this activation as: where 0 ≤ f ij (Z) ≤ 1 was a regulatory transfer function that calculated the influence of metabolite i on the activity of enzyme j.Conversely, if enzyme activity was inhibited by a m metabolites, we modeling this inhibition as: Lastly, if an enzyme had both m activating and n inhibitory factors, we modeled the regulatory term as: v j = min (u j , d j ) where: The quantities j + and j − denoted the sets of activating and inhibitory factors for enzyme j.If an enzyme had no allosteric factors, we set v j = 1.There are many possible functional forms for 0 ≤ f ij (Z) ≤ 1.However, in this study, each individual transfer function took the form: where Z j denotes the abundance of the j factor (e.g., metabolite abundance), and κ ij and η are control parameters.The κ ij parameter represents a species gain parameter, while η is a cooperativity parameter (similar to a Hill coefficient).In the case η > 1, the allosteric interaction displays positive cooperativity.For η < 1, the interaction is negatively cooperative.Finally, if η = 1, the interaction displays no cooperativity.The effect of different values of η on reaction rate can be seen in Figure 3.The model equations were encoded using the Octave programming language and solved using the LSODE routine in Octave [54].In some cases, metabolic fluxes (or other quantities) were scaled according to: rj (t = τ ) = r j − min r max r − min r t=τ (11) where 0 ≤ rj (t = τ ) ≤ 1 denotes the scaled value for flux j evaluated at time τ .We have used this scaling in a variety of other contexts [45,55].
Estimation of model parameters and structures from synthetic experimental data.Model parameters were estimated by minimizing the difference between simulations and synthetic experimental data (squared residual): where xj (τ ) denotes the measured value of species j at time τ , x j (τ, k) denotes the simulated value for species j at time τ , and ω j (τ ) denotes the experimental measurement variance for species j at time τ .The outer summation is respect to time, while the inner summation is with respect to state.
We approximated a realistic model identification scenario, assuming noisy experimental data, limited sampling resolution (approximately 20 min per sample) and a limited number of measurable metabolites.We assumed a constant coefficient of variation of 10% for the synthetic data set.We minimized the model residual using particle swarm optimization (PSO) [56].PSO uses a swarming metaheuristic to explore parameter spaces.A strength of PSO is its ability to find the global minimum, even in the presence of potentially many local minima, by communicating the local error landscape experienced by each particle collectively to the swarm.Thus, PSO acts both as a local and a global search algorithm.For each iteration, particles in the swarm compute their local error by evaluating the model equations using their specific parameter vector realization.From each of these local points, a globally best error is identified.Both the local and global error are then used to update the parameter estimates of each particle using the rules: where ∆ i denotes the perturbation to the vector of parameters k i for particle i. (θ 1 , θ 2 , θ 3 ) are adjustable parameters, L i denotes the best local solution found by particle i, and G denotes the best solution found over the entire population of particles.The quantities r 1 and r 2 denote uniform random vectors with the same dimension as the number of unknown model parameters (K × 1).In this study, we used (θ 1 , θ 2 , θ 3 ) = (1.0,0.05564, 0.02886).The quality of parameter estimates was measured using two criteria, goodness of fit (model residual) and angle between the estimated parameter vector k j and the true parameter set k * : If the candidate parameter set k j were perfect, the residual between the model and synthetic data and the angle between k j and the true parameter set k * would be equal to zero.
We modified our PSO implementation to simultaneously search over kinetic parameters and putative model control structures.In the combined case, each particle potentially carried a different model realization in addition to a different kinetic parameter vector.We kept the update rules the same (along with the update parameters).Thus, each particle competed on the basis of goodness of fit, which allowed different model structures to contribute to the overall behavior of the swarm.We considered five possible model structures (A through E), where network A was the correct formulation (used to generate the synthetic data).We considered a population of 100 particles, where each particle in the swarm was assigned a model structure, and a random parameter vector.The optimization simulations shown in Figure 7 required several hours to complete on a single CPU Apple workstation (Apple, Cupertino, CA, USA; OS X v10.10).The PSO algorithm, model equations, and the objective function were encoded and solved in the Octave programming language [54].

Figure 1 .
Figure1.Proof-of-concept cell-free metabolic networks considered in this study.Substrate S is converted to products P 1 and P 2 through a series of chemical conversions catalyzed by enzyme(s) E j .The activity of the pathway enzymes is subject to both positive and negative allosteric regulation.

Figure 2 .
Figure 2. Schematic of rule-based allosteric enzyme activity control laws.Traditional enzyme kinetic expressions, e.g., Michaelis-Menten or multiple saturation kinetics, are multiplied by an enzyme activity control variable 0 ≤ v j ≤ 1.Control variables are functions of many possible regulatory factors encoded by arbitrary functions of the form 0 ≤ f j (Z) ≤ 1.At each simulation time step, the v j variables are calculated by evaluating integration rules such as the max or min of the set of factors f 1 , . . .influencing the activity of enzyme E j .

Figure 3 .
Figure 3. Kinetics of simple transformations in the presence of activation and inhibition.(A) The conversion of substrate S to product P by enzyme E was activated by S.For a fixed control gain parameter κ control , the reaction rate approached a step for increasing cooperativity control parameter η.For activation simulations κ control = 0.05 and η = {0.01,0.1, 1, 2, 4, 6, 8, 10}; (B) The conversion of substrate S to product P by enzyme E with inhibitor I.For a fixed control gain parameter κ control , the reaction rate approximated non-competitive inhibition for increasing cooperativity control parameter η.For the inhibition simulations κ control = 1.5 and η = {0.01,0.1, 1, 2, 4, 6, 8, 10}.

Figure 4 .Figure 5
Figure 4. ON/OFF control simulations for Network A and Network B for an ensemble of 100 kinetic parameter sets versus time.For each case, simulations were conducted using kinetic and initial conditions generated randomly from a hypothetical true parameter set.The gray area represents ± one standard deviation surrounding the mean.Control parameters were fixed during the ensemble calculations.(A) End product P 1 abundance versus time for Network A. The abundance of P 1 decreased with end product inhibition of E 1 activity (Control-ON) versus the no inhibition case (Control-OFF); (B) End product P 2 abundance versus time for Network B. Inhibition of branch point E 6 by end product P 1 decreased P 2 abundance (Control-ON) versus the no inhibition case (Control-OFF); (C) End product P 1 abundance versus time for Network A. Inhibition of branch point E 6 by end product P 1 decreased P 1 abundance (Control-ON) versus the no inhibition case (Control-OFF).

Figure 6 .Figure 7 .Figure 8
Figure 6.Schematic of the alternative allosteric control programs used in the structural particle swarm computation.Each network had the same enzymatic connectivity, initial conditions and kinetic parameters, but alternative feedback control structures for the first enzyme in the pathway.