Exact Reduction of the Generalized Lotka–Volterra Equations via Integral and Algebraic Substitutions

: Systems of interacting species, such as biological environments or chemical reactions, are often described mathematically by sets of coupled ordinary differential equations. While a large number β of species may be involved in the coupled dynamics, often only α < β species are of interest or of consequence. In this paper, we explored how to construct models that include only those given α species, but still recreate the dynamics of the original β -species model. Under some conditions detailed here, this reduction can be completed exactly, such that the information in the reduced model is exactly the same as the original one, but over fewer equations. Moreover, this reduction process suggests a promising type of approximate model—no longer exact, but computationally quite simple.


Introduction
Consider an environment in which a large number of species interact.This could be, for example, a chemical reaction (with reacting chemical species) or an ecological system (with interacting organisms).To model the behavior of interacting species in many such environments, we often use the generalized Lotka-Volterra equations-a set of coupled ordinary differential equations (ODEs) [23].The generalized Lotka-Volterra (GLV) equations extend 2-species predator-prey models common in undergraduate differential equations classes to any number of species.They are in fact much more general than a predator-prey type system: they provide a framework to describe the time-dynamics of any number β of interacting species, allowing for linear (growth rate) and quadratic (interaction) terms.
In many applications, a large number of species may play a role in the dynamical community behavior.However, from a modeling perspective, the species of interest or of consequence may be restricted to a much smaller subset of α species, where α < β.This occurs in fields as diverse as ecology [25], epidemiology [3,12], and chemical kinetics [26,10,4].There are good reasons to omit many species.First, a modeler may not have access to data about certain species' concentrations, which would be needed to define initial conditions or calibrate additional model parameters.Second, the modeler may not actually know what species should be included.Third, the more species included in the model, then in general the more computationally expensive the model becomes.Except in very special (and usually unrealistic) cases, a system of GLV equations will not admit a closed-form solution; instead we solve these systems with computational models.Therefore, it is common to build reduced models which include only α < β given species.
One immediate question that arises in this context is the following: Given a system of β species, suppose only α are known, or of interest.What is the best reduced deterministic model, in terms of only those given α species?Of course, to answer this question we must define what is meant by "best."To do so, let us first consider the landscape of computational models-one of science, mathematics, and computation.We will broadly classify this consideration into three major areas: model validation, model reduction, and computational implementation, and what the major questions are in each.(Point III is closely related to model verification, a process which checks that the computational model solves the mathematical model correctly [18].)Returning to the question of the best reduced model: the best reduced model would address all three areas, i.e., well-represent the scientific system under study (I); recreate the dynamics of the high-fidelity model (II); and be computationally easier to solve than the high-fidelity model (III).
The first topic, model validation, is a recently growing and still very open field.For a general description, see, for example, [19].For a few specific works, see [2,16].In this paper, the original model of β species represents the true (physical, chemical, biological, etc.) system under study.
In many types of model reduction of systems of differential equations, the goal is to reduce the computational cost (III) while controlling the incurred error (II).This certainly makes sense as an objective-one would expect a reduced model to offer computational savings over the original model.In order to gain computationally efficiency, these types of reductions are not exact but instead offer an approximation of the species behavior.For example, eigendecompositions may yield an approximation of the static equilibrium state and not the dynamical behavior.Other techniques reduce the computational complexity while maintaining time-dynamics, such as volume averaging [17], perturbation theory [5], spectral analysis [13] and separation of fine and coarse scale variables [24].Related work identifies the exact dynamics on the relevant slow manifold, using, for example, fast-slow decompositions [8], or polynomial approximations up to a specified accuracy [11].Identification of symmetries can reveal reduced exact dynamics in initial value problems [27] and the KdV-Zakharov-Kuznetsov equation [22].For a good overview of more methods, see [20].
In contrast, here our goal is to reduce the number of coupled equations that make up the model, or equivalently, the number of species involved in the model.Variables are eliminated in a way that is exact, that is, without loss of information.In terms of the points above, (I) the model is an exact representation of the true system, and (II) the reduction incurs zero error.We investigate two possibilities for this type of dimension reduction in the context of the generalized LV equations, called integral and algebraic substitution.There are two defining characteristics of these methods.First, they preserve the correspondence between the set of α species of interest as they appear in the original model and the resulting set after the reduction occurs.This property is termed species correspondence, or simply correspondence.Second, they create a reduced model which contains the exact same information as the original one, but with fewer equations.Although the resultant model is not necessarily better in a computational sense, this method reveals a path towards model reduction that preserves correspondence, closely approximates the dynamics, and is computationally quite simple.
The two methods discussed here do not automatically find the α species comprising the reduced set, but rather assume them given.Some techniques do choose this set as a step within the reduction process itself.Doing so may favor species with high relative concentrations, or those with slow dynamics, for example.But consider a combustion model in which we are concerned about trace amounts of a contaminant, or an ecological model built to track a species near extinction.The methods presented here allow the modeler to include these critical species a priori.
The rest of the paper is structured as follows.Section 2 outlines the generalized Lotka-Volterra equations.Section 3 presents our main results about two types of exact dimension reduction.Based on these results, Section 4 presents some possible approximate methods, and we conclude with Section 5.

Background: The generalized Lotka-Volterra equations
Let us briefly review the GLV equations.Let x be the β-vector of species concentrations.Here, units refer to the number of specimens per unit area, but specific units are omitted for this paper.In the framework of generalized Lotka-Volterra equations, the system of ODEs is given as where the β-vector b is the intrinsic growth rate vector, and A is the β × β interaction matrix.Note that there is one differential equation for each species variable.In particular, if equilibrium is achieved, then the equilibrium solution is given by Moreover, there is coexistence, or a feasible equilibrium, if, at equilibrium, xi > 0 ∀i ∈ 1, . . ., β.
Here it is assumed all xi(t) > 0, t ≥ 0. Conditions on the entries of A and b that guarantee coexistence are given in [1]; the authors study the effects of intra-(within a species) and inter-(among different species) specific competition on coexistence of large systems.In [7], the authors explore conditions for stability under perturbations, asymptotic feasibility, and equilibrium of large systems.

Exact dimension reduction
We begin with some definitions.Let α, β ∈ Z + .Defintion 3.1 ((β, α)-reducible).A system of β differential equations that can be converted to a set of α differential equations, where α < β and without loss of information, is called (β, α)-reducible.Defintion 3.2 (Integral substitutions).Integral substitutions (IS) may allow for reductions from a system of β equations to α by eliminating the variables xα+1, . . .x β .During this process, we introduce the entire history, or memory, of a subset of the variables x1, . . .xα.
This approach is similar to the Mori-Zwanzig method of model reduction [6].Defintion 3.3 (Algebraic substitutions).Algebraic substitutions (AS) may allow for reductions from a system of β equations to α by eliminating the variables xα+1, . . .x β .During this process, we introduce higher-order derivatives of a subset of the variables x1, . . .xα.
With β = 2, the original model is: The goal is to rewrite x2 in terms of x1 in equation 3.1a, specifically in the term a12x2x1.First, rearranging equation 3.1a gives We denote this quantity y 1 2 because it is a representation of x2 that only depends on x1.With regards to notation, bold type indicates any such introduced variables to more easily distinguish them from the xis.Also, a subscript indicates which variable the new one is replacing, and the superscript shows which variables this new one actually depends on.Now, substituting y 1 2 back into 3.1a yields 0 = 0. Instead, let's substitute y 1 2 for x2 in 3.1b: Integrating, we have, Similarly, the symbol χ 1 2 means this is a variable that is equivalent to x2 but only in terms of x1.Finally, 3.1a becomes: We now have a system of a single differential equation, in terms of x1 and its memory.Note that this process has preserved species correspondence and that there is no loss of information: the variable x1 and its derivative in equation 3.5 are equivalent to that in 3.1a.Example 3.2.The GLV equations are also (2, 1)-reducible via AS.
The first step here is the same as that of the previous subsection, yielding y 1 2 .Next, however, by differentiating y 1 2 , we have: A major difference between the two methods is revealed by inspection of the structure of the resulting sole equations.After reduction via the integral method, the structure of the final equation resembles that of the initial equation 3.1a for x1.The functional form matches in the placement of the variables x1 and χ 1 2 (in place of x2), and also the remaining constants b1, a11, and a12 (which do not appear in the second equation of the original system).In contrast, after reduction via the algebraic method, the resultant equation has the structure of 3.1b.This suggests that in an applied setting, one type of reduction may be advantageous, depending on what information is known about the high-fidelity model, such as various model parameters.
For the remainder of the paper, we will focus on IS, as opposed to AS.The techniques and results between the two methods are quite similar.An advantage of IS is its notational convenience (as described in Remark 3): the reduced equations preserve the structure of the original ones, which simplifies tracking the reductions, in theory.Remark ((β, β − 1) − reducibility).In a similar way, we can reduce any generalized LV system of β species to one of β − 1 species.Note that this level of reduction, from β to β − 1 equations, can happen when the ODEs describe the dynamics of fractional concentrations.In that case, the extra constraint that β i=1 xi = 1 readily allows for a reduction to β − 1 equations, since, for example, x β can be expressed as In this work, however, there is no such restriction on the concentrations, and yet such a reduction is always possible.
Proof.The original model for β species is: At this point, we want to rewrite x β in terms of the remaining β − 1 variables, but we could use any of the first β − 1 equations to do so.Without loss of generality, we choose the second to last one: The colon notation in y 1:β−1 β signifies that this new variable is written in terms of variables x1 through x β−1 .Now, substituting y 1:β−1 β into 3.8c yields And finally, substituting χ 1:β−1 β back into the first β − 1 equations: Thus the set of β ODEs is reduced to β − 1 without loss of information.
In general, a system of GLV equations is not (β, α)-reducible because of the coupled-ness, or "entanglement" that occurs between the species in the reduced set via the introduced y and/or χ variables.To see this, consider the case of β = 3, α = 1.With both the integral and algebraic methods, the model cannot be exactly reduced, unless one of the coefficients is zero, thereby breaking the entanglement.Lemma 3.2.The GLV equations are (3, 1)-reducible if only one aij = 0, i ̸ = j.
Proof.We begin the process to reduce the system of three equations to one by eliminating the variables x2 and x3.When β = 3, the original model is: Repeating the process described in the previous section, we can rewrite either equation 3.11a or 3.11b for x3.Using equation 3.11b, then Next, substituting y At this point the model has been reduced from three equations to two.Consider if we now tried to remove another variable, say x2.The first step would be to rewrite equation 3.15a so that x2 is alone on the left-hand side (LHS).However, now that χ 1:2 3 has been introduced, we cannot cleanly separate x2 out of the right-hand side (RHS) since it is embedded inside the integral term.
(3.25) So, we can exactly reduce the system of three ODEs to one differential equation if we assume a13 = 0.However, note that a nested set of integrals comprises the term χ 1 3 .Indeed, such a system of one differential equation with nested integral terms as equation 3.25 would likely be more difficult to solve than the original set of three coupled ODEs shown in 3.11a -3.11c.Remark (Entanglement).Multiple variables can be eliminated if enough information (memory terms, higher derivatives) is known about the remaining variables.Lemma 3 shows the necessity of the assumption that some interaction term be zero in order to exactly reduce (in this demonstration, that a13 = 0).This assumption begins to reveal the limitations of such reductions via substitutions-in particular, how, after a single substitution, we cannot escape the entanglement of the remaining species.
At the same time, the methods here give insight into how one might approximate the role of eliminated variables in terms of the reduced set.For example, in the case of β = 2, α = 1, consider an approximation χ1 2 in terms of x1 and this extra information: where K(x1) represents some memory kernel of x1.We explore this type of approximation in Section 4.
But first, we complete the reduction via IS for general β and α.Of course, in this setting, a larger collection of interaction terms must be assumed zero.Define the set of coefficients A β,α , so that A β,α = {aij}, i = α, . . ., β − 2; j > i + 1. (3.27) Proof.In Lemma 3.1, we completed the reduction from β to β − 1 equations.The next step reduces to β − 2. Let's start with the β − 1 equations: Now we rewrite x β−1 in terms of the remaining β − 2 variables; again, we could use any of the first β − 2 equations to do so.Let's choose the ODE for x β−2 with the assumption that a β−2,β = 0: And finally, substituting χ 1:β−2 This process can be repeated once more, yielding a system of β − 3 equations, if we also assume a β−3,β = a β−3,β−1 = 0.And we may find the equivalent system in β − 4 equations if the set of zero interaction terms also includes a β−4,β , a β−4,β−1 , and a β−4,β−2 .Continuing in this way, the model can be reduced from β to s equations if a = 0 ∀a ∈ A β,s .Note the assumption that a ̸ = 0 ∀ a ∈ A c β,α allows for division by these coefficients.
Let us examine this set A β,α in more detail.We compute the fraction ρ of these "zeroed" terms out of all interaction coefficients.Corollary Proof.Given β species in the original model, there are β 2 interaction terms, and . In terms of α, β, then ρ is: Then the fraction ρ may be written as In the limit as β → ∞ and for a fixed λ, we have This limiting value is plotted for λ ∈ [0, 1] in figure 1.Note that in this limit, when λ = 0, i.e., there is complete reduction, then half of the interaction terms vanish.At the other extreme, when λ = 1, i.e., α = β so there is no reduction, then of course all interaction terms can be nonzero.
Note that the conditions on the set A β,α and its complement are not necessary, but sufficient.They are not necessary at least in the sense of non-uniqueness: reordering the variables or making different choices about which substitutions to make would lead to a different set of zeroed interactions terms.In any case, we end this section with the following conjecture.The fraction of interaction that must be zero for exact reduction decreases nonlinearly as α → β.

Approximate model reduction
Motivated by the previous results, the final topic of this paper explores approximate model reduction.In this section, we consider the situation that information-interaction coefficients, growth rates, and some observations-is only available about a subset of α out of β species.We aim to create an approximate model for those α species without including any information about the excluded β − α variables.

Algebraic substitutions
In the algebraic method, eliminated variables are exchanged for higher derivatives about the remaining variables.Motived by the AS above, we write x2 as an expansion in x1 and its higher derivatives: This leads to an approximate model for x1: where δ0 = a11κ0 and δ1 = a11κ1.It may also be of interest to see the behavior of x1 given the remaining terms from the original model, i.e., ẋ = r1x1 + a11x 2  1 .We call this the naive reduced model.Let us specify the coefficients of the original model as: and set the initial conditions as x1(0) = 0.289 and x2(0) = 0.665.(These values are approximate.The precise initial conditions were chosen randomly from a standard lognormal distribution.) The introduced coefficients δ0 and δ1 are unknown and must be calibrated.We assume data about x1 over time.Calibration is performed under a Bayesian framework; the posterior means are δ1 ≈ −0.280 and δ2 ≈ −0.357.Code to run Bayesian inverse and forward problems is available here: github.libqueso.com[21].The trajectory of x1 is shown in Figure 2 from the three models: original, naive reduced, and approximate.The approximate model is quite close to the original.Importantly, the approximate model   ).Here we try the same framework with a slightly larger reduction: from 3 to 1.In this case, note that we still replace x2 and x3 with our approximate expression, but still only two new coefficients need be introduced.
Recall the original model for x1 is: The approximate model is then: where δ1 = a12κ1 + a13γ1 and δ2 = a12κ2 + a13γ2.Thus, we can again calibrate simply δ1 and δ2.
Note the naive reduced model is again just Specifically, we set and the initial conditions to x1(0) = 0.289, x2(0) = 0.665, and x3(0) = 5.01.The posterior means are δ1 ≈ −0.311 and δ2 ≈ −0.432.The trajectories from the three models are plotted in Figure 3. Again, the approximate model is a good match to the original for x1.Note that ten coefficients from the original model are omitted (eight interaction and two growth rate coefficients), while only two new parameters are introduced, δ1 and δ2.

Integral substitutions
Another approach is instead inspired by the IS above, so that information about eliminated variables is replaced by memory, or integral, information of the remaining species.we aim for an approximate model by eliminating x2.
In this example, we also try a slight variation from the algebraic one in that we replace the entire term containing x2: Moreover, since, in all of these examples, we assume some data about the species of interest, we can approximate this integral directly from the data, which is generated according to the original model (and not the x1 as given by the approximate model).In these numerical results, this integral is estimated numerically using a simple trapezoid rule.Call this numerical approximation I(t): where the * in x * 1 indicates that this integral is estimated from the data generated by the original model.Then, we pose the approximate model for x1 as ẋ1 = r1x1 + a11x 2 1 + δ1x1 + δ2I(t).(4.9) The posterior means in this case are δ1 ≈ −0.778 and δ2 ≈ −0.449.The different trajectories for this example are shown in Figure 4. Example 4.4 (Approximate IS for β = 3, α = 1).Here we present the final example, with the same original model as Example 4.2 (approximate AS for β = 3, α = 1), and the same approximate model as the previous example.
The results are shown in Figure 5.The approximate integral models also show quite good agreement with the original model.
Note that each example here has preserved species correspondence: the species of interest is x1, and that is precisely what is modeled.This would readily generalize to approximate models for α > 1.And while the resulting differential equations now include either a derivative or an approximate integral, the numerical solution is rather easily obtained without introducing much numerical complexity.[14] provides an expanded investigation into similar AS-type approximate models.

Discussion
In summary, we can reduce a GLV system exactly by introducing integral terms of the remaining species, or by introducing their higher derivatives.In a sense, some variables may be eliminated in exchange for this extra information about the reduced set.A few interesting results emerge from this work: First, a reduction from β to β − 1 variables is always possible.Second, a reduction from β to α is possible if the interaction terms in a specified set are zero.With λ = α/β, the fraction of terms in this set is approximately (1 − λ) 2 /2.Third, we saw that the two different methods in the β = 2 case yielded a single ODE for x1, but either by maintaining the structure of the original ODE for x1, or for x2.Fourth, after reducing to β − 1 variables, an entanglement of species prevents further (exact) reduction without breaking the original model in some other way.
In this paper, this break was achieved by setting interaction terms to zero.Of course, in some systems, this may be a reasonable assumption.This work also serves as motivation or evidence of how to approximate the role of the eliminated species using only the reduced set.Preliminary results suggest reduced models that are no longer exact, but still preserve species correspondence and are computationally feasible.Specific implementations and their effectiveness is an open and rich area for future study.Supplementary.To run forward and inverse problems for the approximate models, code is available here: github.com/rebeccaem/enriched-glv[15].

I
Model validation.Does the mathematical model adequately represent the scientific system in question?II Model reduction.How much error is incurred by use of the reduced model compared to the original, or high-fidelity, model?III Computational implementation.Is the reduced model less computationally expensive than the original model, and by how much?

Figure 1 :
Figure1: The fraction of interaction that must be zero for exact reduction decreases nonlinearly as α → β.
output completely covers the original model output, which lies entirely within the approximate model 50% confidence interval.Example 4.2 (Approximate AS for β = 3, α = 1.
Remark (Resulting functional form).Both the integral and algebraic forms, manipulated in this way, yield a single differential equation in terms of x1.Both methods respect species correspondence and are exact.