The Double Phospho/Dephosphorylation Cycle as a Benchmark to Validate an Effective Taylor Series Method to Integrate Ordinary Differential Equations

: The double phosphorylation/dephosphorylation cycle consists of a symmetric network of biochemical reactions of paramount importance in many intracellular mechanisms. From a network perspective, they consist of four enzymatic reactions interconnected in a specular way. The general approach to model enzymatic reactions in a deterministic fashion is by means of stiff Ordinary Differential Equations (ODEs) that are usually hard to integrate according to biologically meaningful parameter settings. Indeed, the quest for model simpliﬁcation started more than one century ago with the seminal works by Michaelis and Menten, and their Quasi Steady-State Approximation methods are still matter of investigation nowadays. This work proposes an effective algorithm based on Taylor series methods that manages to overcome the problems arising in the integration of stiff ODEs, without settling for model approximations. The double phosphorylation/dephosphorylation cycle is exploited as a benchmark to validate the methodology from a numerical viewpoint.


Introduction
Protein phosphorylation is a ubiquitous regulatory mechanism for cells, generally working to activate or inactivate molecules [1]. From a biochemical viewpoint, to phosphorylate a molecule consists in the binding of a phosphoryl group PO − 4 [2]. The general model to deal with phosphorylation is to exploit the framework of enzymatic reactions, where the substrate M is supposed to be modified (phosphorylated, actually) into the product M p by means of the preliminary formation of a complex C provided by the binding of an enzyme (called kinase, K) in charge to catalyze the phosphorylation, see the scheme in (1).
Phosphorylations usually introduce conformational changes that activate/inactivate the enzymatic activity of a protein, or simply prime degradation processes, like the ones involved in the yeast cell cycle (see, e.g., in [3,4] dealing with the degradation of Sic1 and Whi5, respectively) or those involving tyrosine kinase pathways, regulating diverse cellular processes, and whose dysregulation is one of the leading causes of cancer progression [5]. In many important cases, multi-site phosphorylations are required to ensure the correct timing of activation [6,7]. Besides, according to specific conditions and biological frameworks, phosphoryl groups can be removed, making phosphorylation a reversible activation/inactivation mechanism. Moreover, in this case the dephosphorylation may be treated as a generic enzymatic reaction, with the enzymes that catalyze the reaction called phosphatases P: Combinations of multiple phosphorylations/dephosphorylations have been selected by nature as devices, providing specific useful biological functions. Within this framework, the Double Phosphorylation/Dephosphorylation Cycle (DPDC) is a symmetric network motif consisting of a double reversible phosphorylation step required to activate a given substrate M: where the final product is the doubly phosphorylated M pp . Ordinary Differential Equation (ODE) models of DPDC use to combine in a symmetric fashion the four enzymatic reactions [8], providing a bistability regime where the non-phosphorylated or the doublephosphorylated versions of the substrate is predominant [9]. A primary motivation in investigating the computation of the system solutions is that, in a recent paper [10], it has been proven that symmetry breaking in the dynamical solutions of such a network may lead to modify its emergent properties, including concentration robustness of different stationary solutions. In this context, numerical integration of enzymatic reactions in ODE form has been a matter of investigation for more than a century [11,12], since the seminal works of Michaelis and Menten [13] providing an approximation (the celebrated Quasi Steady-State Approximation (QSSA) [14]) to cope with the double time scale arising whenever biologically meaningful parameters are assigned. Multi-timescale phenomena are very common in biology, see, e.g., the very recent cardiac cell model in [15].
Indeed, in enzymatic networks, the binding/unbinding reactions use to occur at a faster rate, thus leading to stiff ODEs, thus characterized by numerical instability when ordinary numerical schemes are employed for their integration. In particular, in the case of stiff equations, ad hoc integration methods, like the one illustrated in [16] for linear multistep methods, can be employed to approximate the solution efficiently. Like any approximations, there are limitations that may render unfeasible its concrete applicability. Indeed, in [17] we showed how the numerical integration of a basic enzymatic reaction model may create serious problems even to well established procedures like the ones implemented by Matlab in ode45 and ode15s functions. Besides, things become even more crucial in the DPDC, as it has been shown in [18] how the QSSA may miss the bistability property.
In order to deal with stiff ODEs, Taylor Series Methods (TSMs) can be used. These methods build up a polynomial approximation (up to some fixed order k) of the ODE solution around the initial point through Taylor series expansion, which amounts to the recursive calculation of the partial derivatives of the ODE function at the initial point, up to the order k. With respect to standard Runge-Kutta methods, TSMs do not exhibit worse performance in terms of numerical stability, and guarantee a better accuracy in the solution calculation for higher degrees of the approximating polynomial. We refer the interested readers to the works in [19][20][21][22][23] for an in-depth description of TSMs and their numerical properties.
The main issue with TSMs is that, as the approximation order k increases, the calculation of all the required derivatives becomes too cumbersome, and some preliminary transformations need to be applied to the original problem in order to simplify the deriva-tives calculation. To this end, in the present work, we employ some recent technical results published in [24,25], according to which ODE systems can be embedded into higher-order quadratic equations which, in spite of their dimensionality, allow for more efficient differentiation, which in turn can be exploited for numerical integration via TSM. These results have been already exploited in [17] for the class of simple enzymatic reactions (two differential equations), and are here extended to the more challenging DPDC case (seven differential equations). Simulations are promising as the numerical results show a higher qualitative accuracy of the method with respect to standard off-the-shelf solvers for an appropriate choice of the reaction parameters which make the equations stiff.
The paper is organized as follows. Section 2 reviews the methods involving quadratization and approximate integration of a class of differential equations. Section 3 adapts the framework to the context of Double Phosphorylation/Dephosphorylation Cycle. Section 4 provides some numerical simulations of the DPDC system. Finally, Section 5 offers concluding remarks and ideas for further developments.

Exact Quadratization and Approximate Integration of σπ Differential Equations
This section recollects and adapts some results mainly developed in [24,25] for more general classes of dynamical systems than those developed in this work. Except where differently specified, we adopt the following vector and indices convention: a vector v is always a column vector, and v is its transpose. With respect to indices, given a vector v, the scalar v j denotes its j-th entry; instead, in case of double subscripts, like in Z i,j , when dealing with a vector we mean a nested notation, where Z i,j is the j-th scalar/vector component of vector Z i , and vector Z i is the j-th vector component of vector Z; on the other hand, when dealing with a matrix, Z i,j refers to the usual scalar entry in row i and column j.

Exact Quadratization of σπ-ODEs into Driver-Type Differential Equations
We consider a first-order ODE systeṁ where the function f is a formal polynomial of x, i.e., a polynomial writing where the exponents are allowed to be any real number. More formally, the components of f are in the form where p l i,r are real exponents, the ν i quantities X i,l in (5) are named monomials andv i,l are real coefficients. We refer to this kind of functions as σπ-functions, and to the associated system of differential equations as σπ-ODE.
For σπ-ODEs, the following theorem holds, by virtue of which the system (4) and (5) can be densely embedded in a higher-dimensional quadratic system. Theorem 1 (Exact Quadratization Theorem, adapted from the work in [24]). Any σπ-ODE in the form (4) and (5), with domain V ⊂ IR n , is quadratizable on the non empty, open and dense subset U = V \ S (with S denoting the set of all coordinate hyperplanes in R n ), and the quadratization is given by the following homogeneous Riccati equation in the indeterminates x i and Z i,l : where π l i,r = p l i,r − δ i,r , and δ i,r being the Kronecker symbol: .
It clearly appears that the specific Riccati equation constituting the quadratization (6) and (7) includes the original state variables, so it is possible to define an augmented state such that the original ODE system (4) is now embedded in the following extended system evolving in R m , with m ≥ n:ẋ where, with a little abuse, we keep the notation of x for a vector now living in R m , with the first n components provided by the n components of vector x in (4), and the other m − n components provided by vector Z in (7). The coefficients v i,j are suitable linear functions of the coefficientsv i,l in (5). Equation (8) is called 'Driver-type' ODE form, while the matrix The interested reader is referred to the works in [24,25] for further notation (including multi-indices), technical details and full proofs.

Approximate Taylor Series Integration Method
With a slight abuse of notation, let t → x(t) be the solution of (4) with initial condition x(t 0 ), and consider the Taylor expansion of a generic scalar component x i (·) with respect to the initial time instant t 0 : Numerical integration techniques based on the application of the truncated series in (9) to compute a solution of (4) are called Taylor Series Methods (TSMs). Unfortunately, such a series expansion cannot be straightforwardly applied to compute the solution for general systems as it requires the explicit computation of the derivatives of the solution at the initial point, which usually reveals to be too cumbersome. Below is reported a Theorem that allows to compute the coefficients of the Taylor expansion for a system in the 'Driver-type' ODE form (8).
Approximate numerical integration based on Theorem 2 can be readily performed by truncating the series (9) at a finite orderk, provided that the integration formula is reinitialized frequently enough to prevent numerical instability. This readily leads to the following iteration. Letx i (j∆), i = 1, . . . , n, for j = 0, 1, . . . , be the approximate value of x i at time j∆ provided by the algorithm, where t 0 = 0 and a fixed sampling time ∆ > 0 are assumed for simplicity. Then, the approximate solution at all times t ≥ 0 is given by Note that, fork = 1, the proposed integration scheme coincides with the forward Euler method.

The Double Phosphosphorylation-Dephosphorylation Cycle
According to the mass action law, the system of equations governing the dynamics of substrates and complexes in the DPDC scheme in (3) is where the following conservation laws hold: Enzymes K and P can be replaced in (13) according to (15), so that the overall dynamics is not redundant and the system reduces to the following 7-dimensional ODE with the constraint (14): Summing up all left hand sides of the system (16) yields zero, which, on the other hand, is entailed by (14), accounting that M T (the total mass) is constant over time. The algebraic equation (14) defines a six-dimensional manifold in IR 7 , invariant with respect to system (16), which means that, if one takes the initial value on this manifold, the evolution of system (16) remains on the same manifold at all times. Remark 1. Indeed, it can be readily proved the stronger result that the DPDC system is positive [26], as each non-positive term on the right-hand side of any equation in (16) multiplies the variable differentiated on the left-hand side of the same equation; in short, removal terms are linear in the variable of interest. This prevents the crossing of the coordinate hyperplane S already defined in Theorem 1, on which the aforementioned terms are zeroed; this will make the DPDC ODEs quadratizable (see the remainder of this section) on the same domain V ⊆ IR 7 ≥0 of the original variables, where the latter symbol denotes the non-negative orthant in IR 7 .

By setting
it is readily seen that Equation (16) defines a σπ-ODE (see Section 2.1), with state dimension n = 7, where the exponents in (5) are integer. As a consequence, Theorem 1 can be applied so that the system is exactly quadratized, with the quadratization that can be expressed in the 'Driver-type' ODE form (8), with augmented state dimension m = 24, by extending the system with the following adjoint variables, whose dynamics is defined in (7): Therefore, system (16) rewrites as follows: while we can use the chain rule to compute the dynamics of the adjoint variables, resulting, after some substitutions, into the following additional quadratic differential equations: Finally, we obviously haveẋ 24 = 0 initialized to x 24 (0) = 1, implying x 24 (t) = 1 for all t ≥ 0, which allows to turn linear into quadratic terms in the right-hand side of any σπ-ODEs.

Simulation Results
Numerical simulations of the system (16), quadratized in the form (8) as developed in the previous section, have been performed in the Matlab® suite. Inspired by the parameter choices in [14,17], we consider the following parameters: with total amounts (14) and (15) equal to We start from the initial conditions (at time t 0 = 0) where the initial states of the adjoint variables x j (0), for j = 8, . . . , 24, are uniquely determined from (17) and (20). The sampling interval has been set to ∆ = 0.015 s and the overall simulation time is 1 s. For the given choice of parameters and initial conditions, simulations performed according to the standard ODE45 Matlab® solver, based on the Dormand-Prince Runge-Kutta method [27] with default settings, are aborted by Matlab for not being able to meet integration tolerances. Figures 1-7 show the numerical solution obtained for variables x 1 , . . . , x 7 by means of the ODE15s Matlab® variable-order solver for stiff differential equations [28] compared with the method of quadratization and truncated TSM integration proposed in this paper. It is clearly observed that the substrate trajectories provided by ode15s oscillate, while our approach provides a smoother behavior. Regards to the complexes, ode15s trajectories become negative, which is qualitatively inconsistent by virtue of Remark 1, as we know that the system (16) is positive. The same occurs with C 1 for the TSM method with truncation orderk = 1 (Euler), see Figure 4, and with C 2 for the TSM method withk = 2, see Figure 5. Instead, the TSM method applied to the quadratized system with truncation orderk = 3 exhibits non-negative solutions for all the variables.  Figure 8 shows the trajectories of complexes C 1 (top panel) and C 2 (bottom panel), obtained with sampling time ∆ = 0.015 s, compared with the same trajectories computed with a doubled sample interval ∆ = 0.03 s, and the same truncation orderk = 3 of the TSM method. It is apparent that the choice of a sufficiently small sampling interval, jointly with a sufficiently high truncation order, is crucial for the numerical stability of the algorithm. In particular, both C 1 and C 2 violate again the non-negativity constraint in the case ∆ = 0.03 s, as already observed for ∆ = 0.015 s and low truncation orders in Figures 4 and 5, with a notable oscillating behavior and an evident initial overshoot of species C 2 . Note that the pattern of damped oscillations in Figures 4, 6, and 8 is not surprising in numerical simulation and identification in biochemical and biological contexts, for instance, a similar behavior may indeed result from the optimization of the power of the error in multiple linear regression under the assumption of generalized Gauss-Laplace distribution [29].
In summary, the method proposed in this paper seems to be able to return meaningful solutions in the numerical simulation of particular biological conditions when standard solvers may fail.

Discussion
In this work, we proposed a novel approach to the problem of integrating the solution of biological systems expressed by stiff differential equations (for example, those exhibiting an apparent double time-scale separation). To overcome typical numerical issues related to the numerical integration of this kind of system by means of existing solvers, we here rely on recent work exploiting the so-called quadratization of ODEs, which allows embedding the original dynamics in a higher-dimensional space where the system is quadratic and differentiation formulae of any order are computable by means of simple recursions. Such derivatives are exploited within a truncated Taylor Series expansion to build an approximate simple integration scheme, which is proved to work accurately in the in silico simulation of the Double Phoshphorylation/Dephoshphorylation Cycle (DPDC), which is an important regulatory mechanism present in cells. Ongoing and future work is focusing on the construction of a numerical scheme that might overcome the curse of dimensionality due to the computation of coefficients of the high-order terms in the Taylor series expansion for the augmented quadratized dynamics.
Author Contributions: Conceptualization, investigation, writing-review and editing, resources, A.B., F.C. and P.P.; writing-original draft preparation, formal analysis, A.B. and P.P.; methodology, F.C.; software, A.B.; supervision, P.P. All authors have read and agreed to the published version of the manuscript.