An Automated Model Reduction Method for Biochemical Reaction Networks

: We propose a new approach to the model reduction of biochemical reaction networks governed by various types of enzyme kinetics rate laws with non-autocatalytic reactions, each of which can be reversible or irreversible. This method extends the approach for model reduction previously proposed by Rao et al. which proceeds by the step-wise reduction in the number of complexes by Kron reduction of the weighted Laplacian corresponding to the complex graph of the network. The main idea in the current manuscript is based on rewriting the mathematical model of a reaction network as a model of a network consisting of linkage classes that contain more than one reaction. It is done by joining certain distinct linkage classes into a single linkage class by using the conservation laws of the network. We show that this adjustment improves the extent of applicability of the method proposed by Rao et al. We automate the entire reduction procedure using Matlab. We test our automated model reduction to two real-life reaction networks, namely, a model of neural stem cell regulation and a model of hedgehog signaling pathway. We apply our reduction approach to meaningfully reduce the number of complexes in the complex graph corresponding to these networks. When the number of species’ concentrations in the model of neural stem cell regulation is reduced by 33.33%, the difference between the dynamics of the original model and the reduced model, quantiﬁed by an error integral, is only 4.85%. Likewise, when the number of species’ concentrations is reduced by 33.33% in the model of hedgehog signaling pathway, the difference between the dynamics of the original model and the reduced model is only 6.59%.


Introduction
A mathematical model of a chemical reaction network (CRN) can be represented by a system of ordinary differential equations (ODEs) that describe the dynamics of the concentrations of the constituent species participating in the network. Generally, kinetic models of most CRNs involve an immensely large number of variables, thereby making the models difficult to analyze due to the huge computational effort required for the analysis. On top of that, as in general not all of the species' concentrations can be measured in an experimental set-up, identifying the parameters contained in the mathematical model is not feasible and demands massive experimental datasets. Thus, there exists a necessity to simplify the mathematical models of CRNs to reduced variants that preserve the key properties of the original models but contain fewer ordinary differential equations and fewer unknown variables. Numerous model reduction methods for CRNs are described in the literature. For a comprehensive survey of popular model reduction techniques, see, for example, in [1,2]. The quasi-steady-state approximation (QSSA) (see, for example, in ( [3] Chapter 2) and ( [4] Chapter 4)) is one of the most commonly used tools for model reduction of CRNs governed by enzyme kinetics. Even though QSSA can be used for a large variety of CRNs, in many cases it demands high computational efforts. The method is not simple to apply, as it requires a priori experimental (biological) knowledge.
A recent model reduction method proposed in [5] reduces the mathematical model of a CRN by eliminating complexes from its corresponding complex graph. In chemical reaction network theory, complexes are defined as the left-(substrate) and right-hand (product) sides of the reactions. Furthermore, the complexes of a CRN can be naturally associated with the vertices of a complex graph with edges corresponding to the reactions of the network. The method described in [5] selects the complexes to be eliminated in such a way that the behavior of the remaining relevant species in the reduced CRN is close to their original behavior. It is based on the Kron reduction method [6], and reduces the weighted Laplacian matrix associated with the complex graph. The Kron reduction method is a popular method of model reduction for electrical networks and proceeds by the computation of Schur complements [7] of the weighted Laplacian matrix. As an example, let us consider the following CRN consisting of a single linkage class, where C j , j = 1, · · · , c, are distinct complexes and ν j , j = 1, · · · , c − 1, are the overall reaction rates in the forward direction. We recall from the work in [5] that a linkage class is a connected component of the complex graph corresponding to the CRN. After eliminating the complex C 2 using the method proposed in [5], we obtain the following reduced network.
Here, the principle of complex balancing is applied to C 2 to replace the two reversible reactions C 1 ν 1 C 2 and C 2 ν 2 C 3 of the original network by the new reversible reaction C 1ν C 3 . We say that C 2 is complex balanced if its inflow is equal to its outflow. In other words, the concentrations of the species participating in C 2 are adjusted in such a way that ν 1 = ν 2 , which leads to the deletion of C 2 from the complex graph. Moreover, the reaction rates ν j , j = 3, · · · , c − 1, of the other reactions remain unchanged. Note that the goal of considering this example is to illustrate how the reduction procedure in [5] is carried out. It describes how the aforementioned method is used to delete a complex (in this case the complex C 2 ) from the corresponding complex graph. Thus, without loss of generality, we assume that the deletion of C 2 does not cause major changes in the dynamics of the considered chemical reaction network. This method is effective in eliminating intermediate complexes from linkage classes that have more than one reaction. We note that an intermediate complex is a complex participating in more than one reaction, with each reversible reaction considered as a single reaction. For instance, with reference to the network given above, C 2 is an intermediate complex, but C 1 is not. However, as mentioned in [5,8], when every linkage class has just two complexes, the method fails in producing a meaningful reduction as it just deletes reactions from the network. For example, consider the following part of a kinetic model of yeast glycolysis, R 1 : Glco ν 1 Glci, where ν j is the overall rate of the jth reaction in the forward direction. Note that a reaction rate is a function of species' concentrations of the network and it depends on the governing law of the corresponding reaction. In this case, elimination of any complex by the approach proposed in [5] results in deleting the corresponding reaction. For instance, if we delete the complex G6P + ADP from the network, the method will result in deleting the entire second reaction R 2 from the network. The reduced CRN will then have the reaction R 1 occurring independently of the remaining reactions. Therefore, the reduced network will possess a behavior that is completely different from the original one. It is easy to see that the complex graph corresponding to (1) is not connected. As this complex graph does not have any linkage class that contains more than one reaction, the method proposed in [5] does not result in a meaningful reduction of the corresponding model. In this case, there is a need to find a reasonable method of rewriting the complex graph of the given network as a graph with linkage classes that have more than one reaction in order to ensure that the application of the method proposed in [5] results in a meaningful reduction in the number of complexes. We propose to eliminate the species ADP from the network in such a way that we can replace the reactions R 2 and R 4 with the reversible reactions Glci + ATP G6P and F6P + ATP F16BP, respectively. This replacement can be done, for example, by expressing the concentration of ADP in terms of the concentrations of ATP. This can be done by using the two ODEs that represent the change in concentration over time for ADP and ATP. These ODEs can be represented in the following form, where [X] denotes the molar concentration of the species X. From these equations, we easily derive the following conservation law, which express the concentrations of ADP in terms of the concentration of ATP, where C is a non-negative constant (known as a conserved quantity) depending on the initial concentrations of ATP and ADP. The conservation law (2) can now be used to eliminate the concentration of the species ADP from the mathematical model of the network. This elimination allows us to rewrite the mathematical model as a model of a network containing two linkage classes that consist of more than one reaction.

Glco Glci
Note that even though the species ADP is not explicitly participating anymore in the CRN, its concentration still appears in the corresponding mathematical model in terms of the concentration of ATP. The method in [5] may now be used on the network (3) to reduce its mathematical model in a meaningful way by deleting the species (complexes) G6P and F16BP from the corresponding complex graph. Even though the elimination of the concentration of ADP enables the application of the reduction method [5], the reduced model is valid only for trajectories for which the total concentration of the pool of ADP and ATP is fixed.
Most CRNs in real-life have reaction network structures similar to the original yeast glycolysis network (1), with certain linkage classes consisting of only one reaction. New tools for model reduction are required for these types of networks. In this paper, we generalize the approach of the complex graph rewriting procedure briefly illustrated above. This procedure, in conjunction with the method proposed in [5], results in a meaningful model reduction.
The main idea of our model reduction method is to enable the application of the model reduction method proposed in [5] by joining certain linkage classes of a CRN into a single linkage class using linearly independent conservation laws. An important result related to conservation laws is Noether's theorem [9], which states that there is a one-to-one correspondence between a conservation law and a differentiable symmetry of nature. For instance, an example of a conservation law in the field of classical physics is the law of conservation of energy in Hamiltonian systems, which corresponds to time translation symmetry.
We develop an automation process for our model reduction method in order to make it straightforward to apply for a wide range of CRNs governed by any kind of enzyme kinetics. In the automation procedure we assume that the CRN admits a single steady state and is asymptotically stable around it. This assumption is necessary for determining the settling time of the network over which we compare the original model and the reduced model. We created a MATLAB library, which contains all the files corresponding to each step of our step-wise reduction procedure. It uses the input files containing the information of a given CRN provided by the user to reduce the corresponding mathematical model in a fully automated fashion.
Finally, we exhibit the application of our new technique of model reduction for real-life examples of CRNs drawn from the Biomodels database [10]. For each of the cases, we explain how the corresponding mathematical model can be meaningfully reduced.

Preliminaries
In this section, we give a compact description of the mathematical techniques that are necessary for our automated model identification method. We commence by introducing the notations that are used throughout the manuscript and continue by explaining a framework for deriving a system of ODEs that describes the dynamics of the constituent species of a given CRN. Subsequently, we recall the Kron reduction method proposed in [5].
S P denotes a reaction from the substrate S to the product P, where the dotted line indicates that it can be reversible or irreversible. In this case, when referring to its reaction rate we refer to the outgoing overall reaction rate in the forward direction. The transpose and the null space of an m × n matrix A = [A ij ], j = 1, · · · , n, i = 1, · · · , m, are denoted by A and ker(A), respectively. The number of elements of a set I is denoted by n(I). For any vector v ∈ R n , denote by diag(v) the n × n diagonal matrix whose diagonal elements are the corresponding elements of the vector v. Define the function Exp : The Schur complement of the block matrix D ∈R q×q of the matrix L ∈ R p×p is denoted by L D ∈ R (p−q)×(p−q) .

Mathematical Models of Reaction Networks
Consider a CRN that involves m distinct chemical species, c complexes, and has r unidirectional reactions. We present a framework for describing the dynamics of reaction networks using their complex graphs [5,11]. The c complexes of the network are represented by an m × c complex composition matrix Z, whose columns express the complexes in terms of their species. Any CRN is defined by a c × r incidence matrix B, where if the ith complex is the product of the jth reaction, 0, otherwise.
It can be shown that the m × r stoichiometric matrix S of the network is S = ZB. We denote by ν(x) ∈ R r + the vector of reaction rates of the given CRN. The basic structure underlying the dynamics of the species' concentration vector x ∈ R m + is given by the balance law [12]: Let ξ ∈ ker(S ). Using (4) we easily see that this vector satisfies the condition ξ dx dt = 0. Consequently, ξ x is a constant, and thus where x 0 ∈ R m + is the vector of initial species' concentrations, i.e., x 0 = x| t=0 . Equation (5) is called a conservation law of the CRN. The constant term ξ x 0 appearing in the conservation law (5) is called a conserved quantity. Clearly, the number of linearly independent conservation laws is equal to the dimension of ker(S ). In other words, irrespective of the governing laws, we can obtain a maximal set of linearly independent conservation laws by computing ker(S ).
Let Z S i be the column of Z corresponding to the substrate S i of the ith reaction. Assuming that every reaction of the CRN is governed by an enzyme kinetic rate law, the reaction rate of the ith unidirectional reaction between the substrate S i and product P i , as explained in [5], can be expressed as where, for every i = 1, · · · , r, d i : R m + → R + is a rational function and k i denotes the rate constant of the ith reaction. Note that if the governing law of the ith reaction is mass action kinetics, then d i (x) ≡ 1. Consider the r × r diagonal matrices D(x) = diag(d 1 , · · · , d r ) and K = diag(k 1 , · · · , k r ). Define the c × r outgoing matrix ∆ = [∆ ij ] as follows, It can be shown that the c × c weighted Laplacian matrix associated with the complex graph of the network is L(x) = BKD(x)∆ . As explained in [5], the balance laws (4) can be expressed in terms of the Laplacian matrix as where C(x) = Exp Z Ln(x) . Some well-known enzyme kinetic rate laws for which the balance laws can be written in matrix multiplication form (7) are the reversible Michaelis-Menten kinetics [13], Monod-Wyman-Changeaux kinetics [14], Hill kinetics [15,16], Ping Pong Bi Bi kinetics [17], and Koshland-Nemety-Filmer kinetics [18]. In order to clearly illustrate the modeling procedure, we demonstrate it for the following reaction network with four unidirectional reactions, where ∅ is the zero complex representing the environment. It is treated as a regular complex, in which the number of moles of each species is zero as explained in [11]. More precisely, the first reaction is a creation of X 1 and the last reaction is a degradation of X 4 . In this example, the substrate complexes of the four reactions are S 1 = ∅, S 2 = 2X 1 + X 2 , S 3 = 3X 3 + X 4 , and S 4 = X 4 , respectively. Thus, Z S 1 = 0 0 0 0 , Z S 2 = 2 1 0 0 , Z S 3 = 0 0 3 1 , and Z S 4 = 0 0 0 1 . The complex composition matrix and the incidence matrix corresponding to the network (8) are given by Let V f and V r be the maximum reaction rates of the second and the third reactions of (8), respectively. The expressions for the reaction rates (6) depend on the governing laws of the reactions. One possibility is where K i is the Michaelis constant of the species X i for i = 1, · · · , 4, and p 1 , p 2 are positive constants. The reaction rates of the network (8) can be represented in the form (6) by defining the rate constants k i as and the rational terms d i (x) in the expressions of reaction rates as The 5 × 5 Laplacian matrix corresponding to the network (8) is Note that open CRNs are modeled in [5] by extended balance laws of the form where the free term f ∈ R c represents the inflows to and outflows from the complexes of the network. Here, as the zero complex is considered as a regular complex it is included in the Laplacian matrix in order to avoid the unnecessary free term. On top of that, unlike the reduction method proposed in [5], if the zero complex is an intermediate complex, we consider it as a candidate complex for deletion.

Kron Reduction of Mathematical Models Corresponding to Reaction Networks
We describe the model reduction method proposed in [5]. It is based on the Kron reduction [6] and is performed by deleting a set of complexes from the complex graph associated with the CRN. Recall that deleting a complex is equivalent to assuming that it is complex balanced. The complex balancing is carried out by computing the Schur complement [7] of the weighted Laplacian matrix associated with the corresponding complex graph. Definition 1. Let L be an (m + n) × (m + n) block matrix of the form where A, B, C, and D are n × n, n × m, m × n, and m × m matrices, respectively. Assume that D is invertible. The Schur complement of D is the n × n matrix L D defined by Consider again a CRN of c complexes and r reactions with dynamics modeled by the balance laws (7). Let V be the set of indices corresponding to the complexes of the CRN, i.e., V = {1, · · · , c}. Assume that we want to delete the complexes with the set of indices V, which is a subset of V of cardinality] c < c. Denote by D(x) ∈ R c× c the block matrix of the Laplacian matrix L(x) that corresponds to the set of indices V. The complex deletion is carried out by computing the Schur [7]) of the block matrix D(x). It is proven (see in [5] Proposition 1) that L D (x) is again a Laplacian matrix, i.e., it satisfies the properties of Laplacian matrices [19]. On top of that, it is shown that the equation describes the dynamics of a CRN governed by an enzyme kinetics rate law, i.e., the corresponding reaction rates can be written in the form (6), with smaller number of complexes. Here, Z red is the complex composition matrix of the reduced CRN defined by removing from Z the columns corresponding to the set of indices V. A suitable choice of V ensures that some of the elements in the vector of species concentrations x are conserved in time, i.e., the derivatives of some of the elements of x are zeros. This leads to a reduced number of dependent variables in the corresponding mathematical model. As an example consider the following simple CRN of two unidirectional reactions, Let ν 1 and ν 2 be the overall rates in the forward direction of the first reaction and the second reaction, respectively. For i = 1, · · · , 4, denote by x i the concentration of the species X i . In this case, the balance laws (4) can be rewritten as Now suppose that the second complex X 2 + X 3 is complex balanced, which is equivalent to assuming that ν 1 = ν 2 . From the above-mentioned balance laws we obtain dx 2 dt = dx 3 dt = 0, which means that x 2 and x 3 are conserved in time.
In [5], the difference between the dynamical behaviors of the original model and the corresponding reduced model is quantified by an error integral E. Let [0, T] be the time interval over which we want to observe the difference between the behaviors of the original model and the reduced model. Denote by J the set of indices of the significant species of the network. The error integral E is defined as where x i and y i are the concentrations of the ith species in the original model and the reduced model, respectively. Note that in [5] the right endpoint of the time interval has been set manually. However, in our case this right endpoint is chosen to be the settling time of the network, i.e., the time instant at which the considered CRN reaches a sufficiently small prespecified neighborhood of its unique steady state. The model reduction method proposed in [5] is an iterative procedure that selects a subset of intermediate complexes to be deleted from the corresponding complex graph as follows. First, the candidate complexes for deletion, i.e., the intermediate complexes of the network, are identified.
For each candidate complex the error integral corresponding to the reduced model obtained after its deletion is evaluated. These values of error integrals are used to rank the candidate complexes and then delete the candidate complex with the least value of error integral. In the next step, the above-mentioned procedure is repeated with the reduced model. The iteration is continued until the error integral E(T) exceeds a prespecified threshold value.
One of the advantages of the reduction method [5] is that it does not rely on a priori biological knowledge about the CRN. Moreover, it is an easy-to-implement model reduction method that can be used to simplify huge mathematical models of CRNs by retaining the original kinetics.

Equivalent Mathematical Models
In this section, we give a detailed explanation of the basic idea of our reduction technique which we briefly introduced in Section 1. We first establish concepts that are relevant to the model reduction procedure.

Definition 2.
The index set I C j ⊂ {1, · · · , m}, j ∈ {1, · · · , c}, corresponding to the j th complex C j of the network, is the set of indices of the species participating in the expression of the complex C j .
We thus obtain the following expression for the complex C j in terms of its species, where a C j ,i ∈ R + represents the number of moles of the ith species in the expression of the complex C j . It is clear that two complexes C p and C q share common species if and only if Definition 3. Let C be a complex of a CRN. For ε ∈ N, we define εC as the new complex consisting of the species of C with their respective number of moles multiplied by ε, i.e., εC = ∑ i∈I C εa C,i X i .
With the following lemma, we show the equivalence between two mathematical models of a given CRN. Later on, we will see that the application of this lemma allows us to join certain reactions into a single linkage class.
j=1 be the vector of overall reaction fluxes in the forward direction of the CRN R j : S j P j , j = 1, · · · , r. For every ε = [ε j ] r j=1 ∈ N r , the reaction network is equivalent, in terms of its mathematical model, to the reaction network ε j S j ε j P j , j = 1, · · · , r, with the vector of overall reaction fluxes being ν(x) = 1 Proof. Let X α , α ∈ {1, · · · , m} be a random species of the CRN ε j S j ε j P j , j = 1, · · · , r. Using (4), the rate of change of the concentration x α of species X α is given by As the right-hand side of this equation represents the rate of concentration change of species X α in the original network, the proof of Lemma 1 is complete.
, have the same mathematical models, they also possess the same conservation laws.
In the following proposition, we generalize the idea of the complex graph rewriting procedure, as was roughly illustrated in Section 1. We show that in the presence of a shared species between two complexes, a suitable conservation law can be used to join the corresponding reactions with the shared species as intermediate complexes.
j=1 be the vector of overall reaction rates in the forward direction of the CRN R j : S j P j , j = 1, · · · , r. Suppose that, for certain j 1 , j 2 ∈ {1, · · · , r}, two complexes, P j 1 and S j 2 , of different reactions share a common species X α , α ∈ {1, · · · , m}. Define I = ( and I S j 2 are the sets of indices corresponding to the complexes P j 1 and S j 2 , respectively. If for every β ∈ I, there exists a conservation law in which the species X β is participating, but not the remaining species corresponding to I, then we can join the two reactions, R j 1 and R j 2 , into a single linkage class R , with the corresponding reaction rates being The intermediate complex C α is the shared species X α with number of moles a C α ,α = a P j 1 ,α a S j 2 ,α , i.e., C α = a P j 1 ,α a S j 2 ,α X α . Proof. For every β ∈ I, choose a vector ξ ∈ ker(S ) such that the corresponding conservation law (5) satisfies the assumptions of the proposition. Note that this vector strictly depends on β. More precisely, its βth component ξ β is not equal to zero. Moreover, note that all the other components of ξ corresponding to I are zeros. Denote by ξ the vector ξ with its βth element replaced by zero. Using (5) we represent x β in terms of the concentrations of the other species: where x 0 is the vector of species' initial concentrations. By substituting x β as in the above equation in the expression of the vector ν(x) of the reaction rates, we can rewrite the reactions S j k P j k , k = 1, 2, in the following forms Application of Lemma 1 with completes the proof.
As (10) represents a linkage class with more than one reaction, the method proposed in [5] can be meaningfully applied to delete the intermediate complex C α from the corresponding complex graph. In the reduced network, the linkage class (10) can be replaced by a single reaction R : a S j 2 ,α S j 1 a P j 1 ,α P j 2 using the principle of complex balancing.

Remark 2.
We would like to point out the three major steps of the complex graph rewriting procedure described in the proof of Proposition 1. Later on, these steps will be used to automate our model reduction procedure.
M1. Elimination of certain species from the network (rewriting the corresponding concentrations in terms of the concentrations of the other species), after which certain complexes are composed of the same species. M2. Application of Lemma 1 in order to make such complexes identical to each other. M3. Joining the reactions with identical complexes into a single linkage class.

Automatic Reduction Procedure
In this section, we describe the automatic model reduction procedure step by step. We show how to apply Proposition 1 for any kind of CRN independently of its governing laws to join certain reactions in an automatic way. We then apply the strategy in [5] to reduce the number of complexes in the complex graph corresponding to the equivalent network.
Inputs: The list of inputs required for the model reduction procedure is as follows.
• The complex composition matrix Z ∈ R m×c , where m and c are the number of the species and complexes of the network, respectively.

•
The incidence matrix B ∈ R c×r of the network, where r is the number of reactions of the network.

•
The vector k ∈ R r + of rate constants of the reactions.

•
The vector d(x) ∈ R r + of rational terms in the expressions of reaction rates.

•
The vector x 0 ∈ R m + of initial concentrations.

•
The threshold value of the error integral (9), i.e., the maximum admissible value of E.
Outputs: The automatic reduction procedure provides the following outputs.

•
The mathematical model and the complex graph corresponding to the original network. • The mathematical model and the complex graph corresponding to the reduced network.

•
The final value of the error integral.
We divide our model reduction procedure into seven steps. In order to clearly explain the reduction procedure throughout this section, we will demonstrate it step by step for the following example of six unidirectional reactions: with the corresponding reaction rates being Step 1: Mathematical model of the network.
In the first step, we determine the system of differential Equation (7) that underlies the dynamics of the species' concentration vector x ∈ R m + .
Step 2: Settling time of the network.
Each CRN is uniquely described by the system (7) obtained from its structure. We can then commence considering its model reduction. An important tool, which is crucial in our reduction procedure, is the settling time of the considered CRN defined in Section 2.2. We show how to automatically compute it for a given CRN.
It has been proved in [20] that for a class of networks called zero deficiency networks, there exists a unique point satisfying the following system, where ξ i , i = 1 · · · , l, form a maximal set of linearly independent conservation laws. In [11,21,22], the same result has been proved for a related class of networks called complex balanced networks.
It states that such a network is asymptotically stable around its unique steady state corresponding to a particular initial concentration x 0 . We assume that the same property is true for the CRNs considered for our reduction procedure as mentioned in the introduction. Solving the system (13), we find the unique steady state x * ∈ R m + . The settling time of the CRN can then be automatically computed by the so called bound function defined as follows. For a fixed δ 1, the δ-bound function is given as Note that the bound function is computed component-wise. For an appropriate choice of δ, it is an easy numerical task to find the vector τ ∈ R m + of settling time, such that τ i , i = 1, · · · , m, is the time when the bound function corresponding to the ith species becomes 1 and remains so.
Step 3: Selecting species to be eliminated from the network.
Next, we identify the set I of indices corresponding to the species whose elimination from the network results in joining certain reactions. This can be done by using the index sets I C j , j = 1, · · · , c, corresponding to the complexes C j of the network, which can be easily extracted from the complex composition matrix Z. For a given pair of complexes C p and C q that share at least one species X α , define I p,q = I C p ∪ I C q \ {α}. If C p and C q share no species, set I p,q = ∅. It is reasonable to define the set I of indices to be eliminated from the network as I = p,q I p,q . We note that if there exists a complex C, which is made up of a single species X β with β ∈ I, i.e., C = X β , the elimination of X β from the network will end up eliminating the entire complex C, meaning that C becomes the zero complex.
The above-mentioned procedure allows us to determine the set of indices I in a fully automated manner. For example, in the case of the network (12), we obtain I = {3, 5, 6, 8}.
Step 4: Mathematical model of the equivalent network.
In this step, we show how to determine the mathematical model of the equivalent network obtained after eliminating the species corresponding to I from the original network. For every β ∈ I, we find the corresponding conservation law that can be used with Proposition 1 to eliminate the concentration x β from the mathematical model of the original network. It can be done by computing ker(S β ), where S β is the matrix obtained by replacing all the rows of the stoichiometric matrix S that correspond to the elements of I \ {β}, with zero rows. If the assumptions of Proposition 1 are satisfied, then it can be used to join certain reactions. The mathematical model of the equivalent network is then determined according to the steps described in Remark 2. We give a detailed explanation of how Z, B, k, d, and x 0 for the corresponding model change after each step.

M1.
As the eliminated species are no longer participating in the equivalent network, we delete the corresponding rows from Z. Subsequently, the vector x ∈ R m of species' concentrations is defined by deleting the βth, β ∈ I, element of x. Similarly, we obtain the vector x 0 ∈ R m of initial concentrations of the equivalent network. The vector d( x) of the rational functions can be derived from d(x) in the following way. For every β ∈ I, if the species X β is participating in the substrate complex of jth reaction, then d j ( x) = x −S βj β d j (x), which ensures that the reaction fluxes of the network obtained at this step still obey the Equation (6). For the network (12), it is clear that the species X 3 is participating in the substrate complex of the second reaction. After rewriting the concentration function x 3 in terms of concentrations of the other species, we multiply d 2 by x 3 . Thus, . We therefore write the reactions of the network (12) in the following form.
After the elimination of species, the columns of Z corresponding to the complexes that share species become multiples of each other. Consequently, in order to make these columns identical to each other, we multiply each of them by the corresponding constant from (11). Likewise, we divide each rate constant k j by the corresponding constant given in (11). In the case of the network (12), the vector of rate constants of its corresponding equivalent network is We therefore rewrite the reactions (14) in the following form.

M3.
We now delete the duplicate columns of Z and keep only one of them in order to find the complex composition matrix Z of the equivalent network. Suppose that Z ∈ R m× c , where m and c are the number of species and complexes in the equivalent model, respectively. It is clear that, if n(I) = n 1 and the number of duplicate columns is n 2 , then m = m − n 1 and c = c − n 2 + 1.
Let the pth and qth complexes be a pair of identical complexes. We first replace the pthrow of B with and delete its qth row. We then repeat the same technique for all the pairs of identical complexes to obtain the incidence matrix B of the equivalent network. It is clear that B ∈ R c×r .
The complex composition matrix Z ∈ R 6×4 and the incidence matrix B ∈ R 4×6 of the equivalent network corresponding to the network (12) are Thus, the equivalent network corresponding to the network (12) becomes Define K := diag( k), where k is the vector of rate constants of the equivalent network. Then, the Laplacian matrix L( x) ∈ R c× c of the equivalent network is L( x) = B K D( x) ∆ , where D( x) = diag( d) and ∆ is the outgoing matrix of the equivalent network. The mathematical model of the equivalent network is where C( x) = Exp( Z Ln( x)).
In some cases, the equivalent network may consist of independent subnetworks, which contain sets of reactions that do not share any common species. In this situation, we determine all the independent subnetworks of the equivalent network. First, we determine the linkage classes of the equivalent network using its incidence matrix B. The index sets corresponding to the complexes of the equivalent network are then used to determine the indices of the species participating in each linkage class. More precisely, if C j 1 , C j 2 , · · · , C j p are the complexes of the same linkage class L, then the set of indices of L is . It is clear that two linkage classes L 1 and L 2 are dependent, if I L 1 ∩ I L 2 = ∅. Mutually dependent linkage classes form an independent subnetwork of the equivalent network.
Let N j , j = 1, · · · , l, be the independent subnetworks of a network having more than one reaction. As they are independent, we can consider each of them as a separate network. Thus, we consider the reduction of each of them separately. We note that in the case of the network (12), there is only one subnetwork in the corresponding equivalent network.
Step 6: Selecting complexes for deletion.
In this step, we apply the reduction procedure of [5] to meaningfully delete certain complexes from the equivalent network. The first step of this procedure is to determine the candidate (intermediate) complexes for deletion. We note that the intermediate complexes are the ones that participate in more than one reaction, with each reversible reaction considered as a single reaction. It is reasonable to consider all the other complexes as important complexes. We consider the constituent species of the important complexes as the important species. In [5], the candidate complexes for deletion have been chosen manually. Here, we show how to determine such complexes in an automatic way. Let B out be the incidence matrix of the equivalent network, with each reversible reaction considered as a single reaction. B out can be found by deleting the columns of B corresponding to the reverse reactions of reversible reactions. The intermediate complexes can then be determined from B out by detecting the rows that contain more than one nonzero element.
In the case of the network (15), we have The candidate complexes for deletion from the linkage class (15) are 30X 4 and 10X 7 . The important species are X 1 , X 2 , X 9 , and X 10 , which are the constituent species of the important complexes 15X 1 + 15X 2 and 2X 9 + 2X 10 .
Let J i be the set of indices of the important species of N i . As explained in [5], we rank the candidate (intermediate) complexes according to the error integral E i (T i ) defined in (9). Here, T i = max j∈J i τ j is the settling time of N i . We delete from N i the optimal combination of complexes, i.e., the combination with the maximum value of E i (T i ) that does not exceed the predefined threshold.
Step 7: Mathematical model of the reduced network.
Let M i del be the set of indices of complexes deleted from N i , i = 1, · · · , l. Define M del = l i=1 M i del . In the final step, we show how to determine the mathematical model of the reduced network after deleting the complexes with indices M del from the equivalent network. In order to obtain the complex composition matrix Z of the reduced network, we delete the columns of Z corresponding to M del . Define M red = {1, · · · , c} \ M del , which is the set of indices corresponding to the complexes remaining in the reduced network. According to Proposition 1 in [5], the Laplacian matrix L( x) of the reduced network is defined as the Schur complement of L( x) corresponding to M red . The mathematical model of the reduced network is then given by where C( x) = Exp Z Ln( x) .

Application to Real-Life Reaction Networks
In this section, we apply our automatic model reduction method to reduce two computational models of biological processes that consist of several linkage classes, with common species shared between at least two of them. These models have been retrieved from the BioModels database [10], which is a repository of computational models of biological processes available online at https:// www.ebi.ac.uk/biomodels/. For both models considered throughout this section, the threshold value of the error integral for stopping the iterative reduction procedure has been set to 0.15. In general, this threshold can be chosen depending on the desired closeness of the reduced model to the original model. Detailed explanations of the reduction procedure of these models are provided in Appendix A.

Neural Stem Cell Regulation
We consider a mathematical model of neural stem cell regulation (NSCR). The complex graph corresponding to the model is given in the left-hand panel of Figure 1. For the corresponding detailed mathematical model we refer to [23]. The concentrations of the important species participating in the original model are represented in Figure 2.
We note that there is one linkage class with more than one reversible reaction. The remaining linkage classes of the network contain only one reversible reaction. In this case, the method proposed in [5] can be meaningfully applied to delete the intermediate complexes X 7 , X 8 , and X 13 . Further reduction of the network using the same method would eliminate reactions from the network.
To overcome this shortcoming and to obtain a meaningful model reduction of the network, we apply our new reduction technique.
Step 3 of our reduction procedure selects the species X 5 , X 6 , and X 20 to be eliminated from the mathematical model, which allows us to join three linkage classes into a single one. Detailed explanation of the elimination procedure can be found in Appendix A.1. In particular, we use three conservation laws to complete the aforementioned elimination of species. However, as a downside of our elimination procedure, the obtained reduced model is valid only for trajectories for which the three conserved quantities have fixed values as in (A2). For example, the first of these three conserved quantities is the total concentration of the pool of Notch [24] and Notch transmembrane [25], and it is reasonable to assume that in an experimental set-up this pool has a fixed concentration. The complex graph corresponding to the equivalent network of NSCR is shown in the middle panel of Figure 1. Finally, applying the procedure described in Step 6 to this equivalent network, we delete the optimal combination of the intermediate complexes, which consists of X 4 , X 8 , X 13 , and X 21 . The complex graph corresponding to the reduced network is given in the right-hand panel of Figure 1. Even though 33.33% of the species have been deleted from the original model, the error integral is only 4.85%. The concentrations of the important species participating in the reduced model are represented in Figure 2.

Hedgehog Signaling Pathway
Subsequently, we have used our new technique of model reduction to a very different type of biochemical reaction network, namely, a model of hedgehog signaling pathway (HSP). The corresponding detailed mathematical model can be found in [23]. There are nine reversible and two irreversible reactions in the original network. The complex graph corresponding to the original network is given in the left-hand panel of Figure 3. The concentrations of the important species participating in the original model are represented in Figure 4. We observe that it contains two linkage classes consisting of more than one reaction. The remaining six linkage classes consist of only one reversible reaction. In this case, the reduction method proposed in [5] can be meaningfully applied to delete four intermediate complexes, namely, X 4 , X 15 , X 18 , and X 19 . Further reduction using the same procedure would lead to the deletion of reactions, which is not desirable in the sense of preserving the original behavior. However, the use of our new approach results in a meaningful reduction without causing a significant change in the original behavior. Detailed explanation of the species elimination procedure is included in Appendix A.2.
Step 3 of our reduction procedure chooses the species X 1 , X 11 , and X 12 to be eliminated from the network, i.e., I = {1, 11, 12}. Similar to the model of NSCR, we use three conservation laws to eliminate these species. Again, as a downside of our reduction procedure, our final reduced model of HSP is valid only for trajectories for which the three conserved quantities have fixed values given by (A6). For example, the first of these conserved quantities is the total concentration of the pool of ADP (adenosine diphosphate) and ATP (adenosine triphosphate), and it is reasonable to assume that in an experimental set-up this pool has a fixed concentration.The complex graph corresponding to the equivalent network is shown in the middle panel of Figure 3.
Finally, the procedure described in Step 6 results in X 4 , X 8 , X 13 , and X 18 to be the optimal combination of the intermediate complexes for deletion, i.e., the maximal combination of complexes with an error integral of at most 15%. Deletion of these complexes by the procedure of [5] leads to the reduced network with the corresponding complex graph given in right-hand panel of Figure 3. We remark that, even though 33.33% of the species have been eliminated from the model, the resulting error integral is only 6.59%. The concentrations of the important species participating in the reduced model are represented in Figure 4.

Conclusions and Discussion
In this paper, we have illustrated a new approach to the model reduction of CRNs, which extends the method proposed in [5]. Our method is based on eliminating concentrations of certain species from the mathematical model corresponding to the network using the conservation laws obtained from the model. This elimination allows us to rewrite the complex graph of the network in a preferred form, for which the model reduction method proposed in [5] becomes meaningfully applicable. Even though the procedure results in reducing the order of the corresponding mathematical model, new parameters depending on the vector of initial concentrations are added to the model.
We have implemented an automated process for our model reduction method, which is divided into seven steps. We created a MATLAB file corresponding to each step and all files are included in a single MATLAB library. The information of a given CRN is provided by the user and subsequently used to reduce the corresponding mathematical model in a fully automated manner. The error integral defined in [5] is used to measure the difference between the behaviors of the original model and the reduced model. An important tool in the definition of the error integral is the settling time of the network, which was computed manually in [5]. We have shown how to automatically determine the settling time of a given CRN using its steady state. We have also been able to determine the intermediate complexes in a fully automated manner using the incidence matrix of the network. The MATLAB library consisting of all the files corresponding to our step-wise reduction procedure is provided as Supplementary Material.
We have successfully applied our model reduction method on two real-life examples: neural stem cell regulation and the hedgehog signaling pathway. For each case, we have used our techniques to enable the reduction method proposed in [5] to be applied to a significant extent. We have been able to automatically join several linkage classes of the network by eliminating concentrations of certain species from the corresponding mathematical model. Table 1 provides the percentage of deleted species and the value of the corresponding error integral for each reaction network. The detailed explanation of the reduction procedure corresponding to these models can be found in the Supplementary Material. Table 1. The amount of deleted species and the value of the error integral (in%) corresponding to each example after its model reduction using our method.

Model Deleted Species (%) Error Integral (%)
NSCR 33.33 4.85 HSP 33.33 6.59 The major tools in our automated reduction method are the linearly independent conservation laws, which are used to eliminate the concentrations of species with index set I from the network. The identification of the set I is straightforward. However, in some cases, not all the selected species can be eliminated from the network because of the lack of suitable conservation laws, i.e, in the case when the number of the linearly independent conservation laws is less than the number of elements in I. In such cases, we select I in a way that the number of its elements is equal to the number of linearly independent conservation laws of the network. This adjustment ensures that the species with index set I can be eliminated from the network. Note that, even though this elimination rewrites the complex graph of the network in a preferred form (i.e., increases the number of intermediate complexes in it), new parameters (conserved quantities) are added in the corresponding mathematical model.
Although the procedure of elimination of species using conservation laws described in this paper improves the extent of applicability of the model reduction method in [5], it is no longer valid for all trajectories of the original model. As mentioned earlier, the reduced model is valid only for trajectories with fixed values of these conserved quantities. This aspect is similar to the case of the Michaelis-Menten approximation of enzyme kinetic networks which depends on a conserved quantity, namely, the total enzyme concentration.
One of the main advantages of the model reduction procedure proposed in [5] is that it preserves the governing laws of the CRN. For instance, if a CRN is governed by Michaelis-Menten kinetics, then the reduced CRN by the approach in [5] will again be governed by Michaelis-Menten kinetics. However, this is not the case for our automatic reduction method due to the rewriting procedure, as we have explained in [8].
In [5], the behaviors of the species' concentrations participating in the reduced model are compared with the ones participating in the original model using the error integral (9) defined over a manually set timescale. In this paper, the same procedure has been adopted with the significant modification that the time length of the trajectory is automatically determined for models with unique steady state. We only consider CRNs that are asymptotically stable around a unique steady state. The importance of asymptotically stability arises in determining the settling time automatically. Note that for CRNs that are not asymptotically stable, the settling time cannot be automatically determined. For instance, in the case of oscillatory networks, it is not clear what time frame should be considered for comparing the dynamics therefore its automatic determination is not straightforward. However, this does not mean that the application of our reduction procedure for such CRNs is not possible. In the case of CRNs that are not asymptotically stable, the determination of the time-scale for comparison of the dynamics of the reduced and the original CRNs can still be done manually based on the user preference.
The error integral, as defined in [5], strictly depends on a particular trajectory. Considering an ensemble of trajectories would seem a better choice for more robust reduction. However, in this case it is not clear which subset of trajectories to choose from an infinite number of trajectories for comparing the original model and the reduced models. Moreover, such a reduction requires a higher computational effort, which is not desirable as we aim at using a model reduction method that is computationally less intensive.
The main principle of our model reduction method is similar to the one of QSSA, which is currently the state-of-the-art reduction technique for networks consisting of enzyme-catalyzed reactions. Our method, as well as QSSA applied for deriving the Michaelis-Menten kinetics, make use of combinations of conservation laws and the principle of complex balancing to reduce the number of complexes from the corresponding complex graph. We use the conservation laws to join certain linkage classes of a given network, which allows us to apply the procedure of [5] for complementary model reduction. On the other hand, when applied for deriving the Michaelis-Menten kinetics, QSSA uses the total enzyme (which is distributed between the enzyme and the intermediate enzyme complexes) in terms of a conservation law to eliminate the concentrations of enzyme species from the corresponding mathematical model.
Our approach to the model reduction is essentially QSSA applied to complexes rather than species. It would seem as though the settling time of complexes would be another choice (rather than the error integrals) for selecting the complexes to be deleted from the network. This could provide a more robust ranking of candidate complexes to be deleted as such a ranking might be independent of the chosen trajectory. However, even though the settling time can be used for ranking the candidate complexes according to their speed of reaching steady state, it is not clear which subset of candidate complexes should be considered ultimately for deletion because we still need a measure for quantifying the difference between the reduced model and the original model.

Conflicts of Interest:
The authors declare no conflicts of interest.
Similarly, we can rewrite the linkage class (A1) as We therefore join R 1 , R 2 , R 7 , and the linkage class (A3) into a single linkage class: The equivalent network consists of four independent subnetworks, namely, R 8 , R 9 , R 10 , and the linkage class (A4). We apply the reduction procedure of [5] to the linkage class (A4) to eliminate the optimal combination of its intermediate complexes. We remark that the first three independent subnetworks remain unchanged after reducing the linkage class (A4). There are five intermediate complexes, namely, X 4 , X 7 , X 8 , X 13 , and X 21 , which are candidate complexes for deletion using the method proposed in [5]. The iterative test described in Step 6 in the main paper selects {X 4 , X 8 , X 13 , X 21 }, to be the optimal combination of intermediate complexes for deletion. Therefore, the principle of complex balancing is used to delete these complexes from the linkage class (A4). Even though seven species (33.33%) have been deleted from the original model, the error integral is only 4.85%. From the structure of the Laplacian matrixL( x) (see Equation (15) of the main paper) we derive the complex graph of the reduced linkage class: Table A1 provides a quantitative comparison of the original model of NSCR and the corresponding reduced model. Comparison of concentrations of important species participating in the original model and the reduced model is illustrated in Figure 4 of the main manuscript. Table A1. Quantitative comparison of the original model and the reduced model of neural stem cell regulation.
The remaining five linkage classes consist of only one reversible or irreversible reaction. Clearly, in (A5) there are four intermediate complexes, namely, X 4 , X 15 , X 18 , and X 19 . In this case, the reduction method proposed in [5] can be meaningfully applied to delete the optimal combination of these complexes. Further reduction using the same procedure would lead to the deletion of reactions, which is not desirable in the sense of preserving the original behavior. However, we show how to use our new reduction technique for a meaningful complementary reduction without causing a significant change in the original behavior.
Step 3 of our reduction procedure selects X 1 , X 11 , and X 12 to be eliminated from the network using the following conservation laws.
Note that, similar to the case of the NSCR model, the constants given in the right hand side of the Equation (A6) strictly depend on the vector of species' initial concentrations, meaning that for different values of initial concentrations, we will have different values for these constants. After the aforementioned elimination, we can rewrite the reactions R 1 , R 9 , and R 10 in the following form. R 1 : X 8 ←→ X 2 + X 9 R 9 : X 18 ←→ X 13 R 10 : X 20 ←→ X 19 Similarly, we rewrite the linkage classes (A5) in the following form.
X 13 ←→ X 18 −→ X 4 ←→ X 15 ←→ X 5 + X 14 We therefore join the reactions R 1 and R 2 ; reaction R 4 and the first linkage class of (A7); and reaction R 5 and the second linkage class of (A7). As a result, we obtain three linkage classes with more than one reaction: X 2 + X 9 ←→ X 8 ←→ X 10 X 21 ←→ X 13 ←→ X 18 −→ X 4 ←→ X 15 ←→ X 5 + X 14 (A8) The equivalent network contains six intermediate complexes X 4 , X 8 , X 15 , X 18 , X 19 , and X 20 , which form the set of candidate complexes for deletion from (A8) by the approach in [5]. Finally, the procedure described in Step 6 in the main paper selects {X 4 , X 18 , X 8 , X 13 } to be the optimal combination of the intermediate complexes for deletion, i.e., the maximal combination of complexes with an error integral of at most 15%. The reduced linkage classes are X 2 + X 9 ←→ X 10 X 21 −→ X 15 ←→ X 5 + X 14 (A9) Even though 33.33% of the species have been eliminated from the network, the resulting error integral is only 6.59%. Table A2 provides a quantitative comparison of the original model of HSP and the corresponding reduced model. A comparison of concentrations of important species participating in the original model and the reduced model is illustrated in Figure 4 of the main manuscript.