Stochastic Modeling of Plant Virus Propagation with Biological Control

: Plants are vital for man and many species. They are sources of food, medicine, ﬁber for clothes and materials for shelter. They are a fundamental part of a healthy environment. However, plants are subject to virus diseases. In plants most of the virus propagation is done by a vector. The traditional way of controlling the insects is to use insecticides that have a negative effect on the environment. A more environmentally friendly way to control the insects is to use predators that will prey on the vector, such as birds or bats. In this paper we modify a plant-virus propagation model with delays. The model is written using delay differential equations. However, it can also be expressed in terms of biochemical reactions, which is more realistic for small populations. Since there are always variations in the populations, errors in the measured values and uncertainties, we use two methods to introduce randomness: stochastic differential equations and the Gillespie algorithm. We present numerical simulations. The Gillespie method produces good results for plant-virus population models.


Introduction
Viruses cause a great number of diseases in plants. In [1] the authors gave a review of the top ten viruses. In plants the most common way that viruses are propagated is by means of a vector, usually insects. A carrier insect will bite an infected plant, thereby infecting it [2]. Mathematical models of virus-caused diseases in plants can be used to better understand the processes involved [3][4][5][6][7]. Viruses take time to propagate and replicate. These effects can be taken into account by considering latent populations [8,9]. In this paper we present three different mathematical models of virus propagation in plants with a predator used as a biological control of the insects that transmit the virus [10]. The description of the model is given in terms of differential equations and also biochemical style reactions. Models based on differential equations are built on the assumption that the number of individuals is very large, which is not always the case. The model that we use incorporating delays in the spread of the virus in both plants and vectors is based on [11]. An extension to optimal control through the use of predators and insecticide is in [12]. The authors in [13] introduced a delay to the model in [8] to account for the incubation period of the plants. In [14] there is a different model with delays. Populations have variabilities and there are errors in measuring and estimating the parameters involved. The variations and uncertainties in the populations and environmental conditions can be modeled by introducing randomness or stochasticity into the models. One common way is to use stochastic differential equations [15,16]. A second way is to consider that some of the coefficients in the model are random variables, as in the method of polynomial chaos [17,18]. Another method is to consider discrete populations and work with Markov chains [19][20][21], including working with the master equation [22]. Models based on continuous time Markov chains usually use the stochastic simulation algorithm of Gillespie [23,24]. Very complex models of plant viral assembly have been presented in [25,26]. Mathematical models of populations involve a series of simplifying hypotheses [27]. One of them is that the individuals in each population group have the same properties. Another usual one is that the populations are homogeneously distributed. A third one is that the number of individuals is large. Based on these simplifications, the populations can be assumed to be continuous and the model can be described in terms of differential equations or maybe delay differential equations. An alternative method of developing mathematical population models is to consider the interactions between individuals to be a reaction. For example, an individual of the class susceptible S dying can be described by the following reaction: S k − → . Here the blank space denotes an empty population. Using differential equations, it would be dS dt = −kS. Additionally, the interaction between a susceptible S and an infective I leading to a new infective is S + I β/N −−→ I + I, while the corresponding differential equation is dS dt = −βSI/N. k is the death rate, β/N is the infection rate divided by the total population N in the differential equations and both are the reaction rates in the corresponding reactions. This is the formulation preferred by biology and chemistry researchers. By dividing the time period of interest into subintervals and considering that all the different reactions happen at the same instant in each subinterval, the reaction system can be converted into a system of discrete equations giving the change of each population in the time subinterval. By letting the length of the time subintervals go to zero, a system of differential equations is obtained. However, if the number of individuals is not very large, as is the case in our plant-virus propagation problem or in cell processes in systems biology, the limiting process introduces errors, and the assumption that all reactions occur simultaneously is not a good one. An alternative is to consider the reactions as continuous time Markov chains [28]. This assumption leads to the stochastic simulation algorithm, or Gillespie algorithm [23,24], and its variations.
We apply this method to a simple susceptibles (S), infectives (I) and recovereds (R) (SIR) epidemic model and to a predator-prey model. The objective of presenting these two simple models is to introduce the basic ideas of writing ordinary differential equations as biochemical reactions. Then we apply them to our main interest, which is a model of virus propagation consisting of six populations, susceptible plants, infective plants, recovered plants, susceptible insects, infective insects and predators. For this virus propagation model, we consider three cases: all the interactions are mass actions-some of them are saturated and some are saturated with delays. The saturated interactions are modeled using Holling type 2 functionals [29]. We also write the models as a system of reactions. Simulations were performed using the stochastic simulation algorithm. For comparison purposes, the corresponding system of differential equations is solve numerically, and finally white noise is added to the differential equation systems to obtain stochastic differential equations and simulate them numerically. The rest of the paper is organized as follows. In Section 2, the development of the models is presented, as is the addition of stochasticity to the models to take into account the variability in the populations and other uncertainties always present in the processes modeled. Next, the virus propagation models and the numerical methods are described. Section 3 shows sample numerical simulations. In Section 4, the results are discussed. Finally there is a conclusions section.

Mathematical Modeling
Mathematical models for populations are usually given in terms of differential equations or difference equations. The differential equations may be ordinary, delay, partial or even fractional. However, the models may also be given by describing the interactions between the different population groups and the rate at which they happen. That is, they may be described in terms of reactions with the same structure as biochemical reactions. Therefore, we can write differential equations arising from mathematical biology as reactions. For example dS dt = −kS can be written as S kS − → , where the ban space means that the population disappears. As a second example, dS dt = −βSI is equivalent to S + I βSI − − → I + I. Sometimes the quantity written on top of the arrow is the number of times the reaction occurs in a population (propensity), and sometimes only the reaction rate constant is included. We use the propensities.
As an illustration on the basic ideas of converting differential equations into biochemical reactions, consider the well known SIR model giving the interaction of susceptibles (S), infectives (I) and recovereds (R) [27]: In writing the model as a system of reactions, we obtain a reaction for each distinct term on the right hand-side of the differential equations. In this example we have two such terms, βSI N and γI. The first term has the reaction of S and I. The second gives the change of I to another population. If the differential equation that has the derivative of a population includes the given term on the right-hand side with a positive sign, then the product of the reaction is that population. Hence, by writing the model as reactions between the populations we have The first reaction is the conversion of one S into one I, and the second one is the conversion of one I into one R.
The system of reactions is not unique. This non-uniqueness has been established by [30]. In this case a more complex system is In this second set of reactions, S loses one member first and in the next reaction I gains one. The reaction happens in two steps. We use the simpler formulation with S converting directly into I. However, below we will show that both formulations give the same system of differential equations.
To convert the model given in terms of reactions to a differential equation model, consider a time interval [t, t + δt] and assume that all the reactions occur at the same instant in this interval. For the reaction system (3) we have Assuming that the three populations are large enough, so that all the reactions happen even for very small δt, we can take the limit as t → 0 to get By subtracting the first equation from the second, we obtain the original system of differential Equation (1). Similarly, using the reactions in (2), we obtain (1) directly. In [31], the authors used biochemical reactions to model SIR processes. Reference [32] also included implementations of the Gillespie algorithm for the SIR model.
As a second example, consider a simple predator-prey model as given by Lotka and Volterra [27]: where x is the prey and y is the predator. The corresponding system of reactions is The first reaction is the birth of new prey, the second reaction is the elimination of prey by predators, the third one is the conversion of dead prey into new predators and the last one is the death of predators. The conversion of prey into predators needs to be in two reactions, since one killed prey does not convert into one new predator.
Even though it is straightforward to convert a population model based on differential equations to one described by reactions, there exist software packages that will automate the process. Biocham [33] http://lifeware.inria.fr/biocham4/ (accessed on 12 December 2020) will convert systems of ordinary differential equations with general interactions between populations written in xppauto format [34] to biochemical reactions. Moccasin [35] uses biocham and adds an interface for the MATLAB format of differential equations. Both programs write the equations in SBML [36], a widely used machine-readable description of biochemical reactions.

Stochastic Modeling
In order to produce more realistic results, mathematical models need to take into account the existence of errors in the observed or measured population data; variability in the populations; and uncertainties such as missing data and lack of knowledge.
These uncertainties can be modeled using random differential equations wherein it is considered that the parameters are random variables [37]. Another method is to use discrete or continuous time Markov chain models [38,39]. A different method consists of introducing the uncertainties as white noise and obtaining stochastic differential equations [16,40]. In this paper we only consider the second and third methods.

Continuous Time Markov Chain Models
As mentioned above, population models can be described in terms of biochemicaltype reactions. A Markov chain is a stochastic process in which the probability of an event happening depends on a sequence of possible events, in which the probability of each event depends only on the previous event [28,39]. In a discrete time Markov chain, the changes of state happen at fixed points in time. In a continuous time Markov chain, the changes of state can happen at any time. Each reaction is a random event that has a given probability of happening. This probability is a function of the reaction rate and of the numbers of individuals of the populations involved. Since the reactions can occur at any time, the reactions are continuous time Markov chains. The master chemical equation for a reaction is the time evolution equation for the probability distribution over the state space of a Markov process. This is derived by substituting the transitional probability of the Markov process into the Chapman-Kolmogorov equation [41,42]. For a system of reactions, the master chemical equation is a system of ordinary differential equations that is hard to solve either analytically or numerically [43,44]. An alternative is the stochastic simulation algorithm (SSA) proposed by Gillespie [23] which produces numerical realizations. The next reaction and the time until it occurs are determined by Monte Carlo simulations involving the propensities of the reactions, and the process is repeated. The processes are Poisson processes with exponentially distributed transition times. Improvements on Gillespie's direct method are given in [24,45]. The SSA works with populations which should not be very small. Numerical implementations are, for example, in [46][47][48]

Stochastic Differential Equations
Randomness can be added to an ordinary differential equation dx = f (x, t)dt, x(t 0 ) = x 0 by including a white noise process [16,40]. A stochastic differential equation is given by where ω is an element of the sample space and X = X(t, ω) is a stochastic process. The initial condition X(0, ω) = X 0 is taken to be known with probability one. A Brownian motion or Wiener process is formed by a sequence of random variables parameterized by time that are independent and identically distributed (iid). A stochastic process W(t), t ∈ [0, ∞] is a Wiener process (or a standard Brownian motion) if it satisfies: (i) It is defined for The noise term is called additive if it is independent of the population. This noise is also called environmental noise. If the noise term is proportional to the population, it is called multiplicative. These two are the most common types of white noise used, but they can have more complicated forms [49][50][51][52]. For uniform populations the environmental noise may be dominant. However, populations usually have variations. Demographic stochasticity is usually defined as the variation in the time evolution of a small population due to the randomness of individual birth, death, infection and other rates. However, the term can also be used for populations of any size. It can be related to stochasticity with multiplicative white noise by considering that the rates have a deterministic part plus a stochastic one. Additionally, environmental fluctuations may be related to overpopulation in ways such as shortage of food, increased aggression toward each other, etc. Therefore, a reasonable way to modify the deterministic equation is to consider introducing randomness that is proportional to the size of the population. However, this is just an assumption that needs to be verified, even though it has been commonly used, for example, in [53,54]. The best choice of white noise regarding its magnitude depends on the particular problem and is still an open research question.

A Plant-Virus Model with Biological Control
In this paper, we consider the application of the SSA and stochastic differential approaches to the model in [10]. We consider six different population groups: susceptible plants S(t), infected plants I(t), recovered plants R(t), susceptible vectors X(t), infected vectors Y(t) and predators P(t). Each variable represents the number of individuals in the respective population group at time t. Susceptible plants are healthy but could get the disease if infected with the virus. The infected plants have the virus but can only infect a susceptible plant through a vector. Additionally, the death rate of infected plants is higher than that of susceptible plants, since infected plants can also die from the viral infection. We also assume that farm workers replace any dead plant immediately with a new susceptible plant. Therefore, we can assume that the total plant population remains constant. We will denote this constant by K. Using this assumption, we can simplify the model in the deterministic case, since K = S(t) + I(t) + R(t) can be used to eliminate the recovered population from the system of equations, and thus we can work with only five populations. This cannot be done in the stochastic case. The virus is not present in susceptible insects but they can be infected with it if they bite an infected plant. By biting a susceptible plant, infected insects can transmit viruses to it. Another assumption is that there is no vertical transmission of the virus in either plants or vectors. Moreover, we assume that the the vector does not get sick from the virus and thus it does not defend against the virus and will remain infected for the rest of its life. Therefore, there are no recovered vectors. The predators feed on the vectors and use the resulting energy to increase the number of predators. We also assume predators do not get infected by the virus even if they eat an infected vector. There is also intra-species competition between predators for the insects. A consequence of the infected vectors not being sick is that the predators feed on the infected insects and susceptible insects at the same rate. The interactions between vector and plant and predator and vector are assumed to have a limitation modeled by a predator-prey Holling type 2 functional [29]. For a large number of vectors, the number of infected plants due to the infected vectors tends to saturate as the number of infected vectors increases, which is the behavior of the Holling type 2 functional. In other words, for small number of vectors, doubling the population doubles the number of plants that are infected by the vectors, but for large number of vectors, doubling this number does not double the number of infected plants since there are not enough plants to be infected. Even though there are many other functionals that can be used to model this saturation effect, the Holling type 2 is a simple one.
After an infected vector bites a susceptible plant, it takes time for the plant to be infected, since the virus has to enter the plant cells, replicate, burst the cell and spread in the plant. It also takes time for the virus to spread inside a susceptible insect after it bites an infected plant. Hence, we introduce two discrete delays: τ 1 , which is the time it takes a plant to become infected after an infected bite, and τ 2 , the time it takes a vector to become infected after biting an infected plant. τ 1 is much larger than τ 2 since the infection process is more complex for plants. The assumptions used in the model are: the number of plants is constant, so the recovered plant population can be eliminated from the system of equations; plants die and are infected by infected vectors and converted into infected plants after a delay; infected plants can recover and also die; susceptible vectors are recruited at a constant rate, can die, are infected after biting an infected plant after a delay and can be eaten by a predator; infected insects can die and be eaten by a predator; predators are recruited at a constant rate, grow due to they eating vectors and can die; the interactions between populations saturate according to a Holling type 2 functional.
The model with the two discrete delays is The meanings of the parameters and the values used in the simulations to obtain the results presented in Section 3 are in Table 1, which is based on the data in [11]. P-unit is the number of individuals in the population group. Contact rate between predators and healthy insects 0.05/day/P-unit c 2 Contact rate between predators and infected insects 0.05/day/P-unit δ Natural death rate of predators 0.05/day α 3 Saturation of predators due to insects 0.1/P-unit Λ p Recruiting rate of predators 0.4 P-unit/day α 4 Conversion rate of predators due to insects 0.1 τ 1 Delay for plants 24 days τ 2 Delay for vectors 1 day The equivalent reactions-based system is Note that the delays do not appear explicitly in the reactions. The stochastic differential equations for the virus model are given system (8).

Numerical Methods
Numerical methods for ordinary differential equations can be modified for delay differential equations by integrating piece-wise over time intervals chosen such that they are multiples of the delays [55,56]. A very good numerical solver based on Runge-Kutta methods is given in [56]. In our numerical simulations we used the following values for our parameters: K = 63, β = 0.01, β 1 = 0.01, α = 0.2, α 1 = 0.1, µ = 0.01, m = 0.2974, γ = 0.01, Λ = 10, d = 0.2, c 1 = 0.05, c 2 = 0.05, δ = 0.05, Λ p = 0.4, α 3 = 0.1 and α 4 = 0.1. Additionally, we chose the following initial conditions: S(0) = 59.8478, I(0) = 1.57612, X(0) = 14.6247478, Y(0) = 19.5 and P(0) = 2. We considered the values of the delays to be τ 1 = 24 and τ 2 = 1. The history for the delay equations is usually taken to be constant and equal to the initial values. However, if we consider the model in terms of reactions, the delayed reactions do not happen during the history, so we took the history to be equal to zero. The values of most the parameters were taken from [7] and those referring to the delays and the predator were from [10]. These are not implied to apply to specific plants, vectors and viruses. We were not able to find real values for many of the parameters. Even for the well studied maize streak virus, the model in [57], which uses the parameters presented in [58][59][60], some of the values are assumed.
For stochastic differential equations using Ito calculus, two common methods are the Euler-Murayama and Milstein methods [15]. Both can be easily modified to include delays in a similar way as for ordinary differential equations [61]. The SSA can be modified to include delays in the reactions by adding the corresponding delay to the time when the reaction happens [62,63].
For the deterministic ordinary differential equations we used an Euler method implemented in GNU octave [64]. For the stochastic differential equations the Milstein method was also implemented in octave. Both methods are first order. For the reactionbased method, we used the software stochPy [65], an open source program implemented in Python.

Results
The first simulation is of the SIR model given by differential Equation (1) and reactions (3). The values of the parameters were: N = 63, β = 0.2 and γ = 0.1. The initial values were S(0) = 60, I(0) = 2 and R(0) = 1. Figure 1 shows the deterministic simulation on the left and the SSA simulation on the right. For the SSA simulation, only the average values of the populations are plotted. For the values of the parameters used, the populations tended to the disease-free equilibrium for a large amount of time.
The next simulation was of the predator-prey model described by Equation (4) and reactions (5). The parameter values used were p = 1, r = 1, b = 1 and d = 1. The initial conditions were X(0) = 2 and Y(0) = 2. Figure 2 shows on the left the deterministic simulation and on the right the simulation using SSA. For the SSA simulation only the average values are plotted. Note that the deterministic solution is periodic but SSA is not quite periodic.  The next simulations were for the plant-virus model given by Equation (6) and reactions (7). The stochastic differential equations used were (8). The values of the parameters used in the simulations are in Table 1.
We did three simulations of the plant-virus model. The first one was with mass action kinetics and no delays. That is, α 1 , α 2 , α 3 , α 4 , τ 1 and τ 2 all being equal to zero. All the other parameters were as given in Table 1, with the exception of σ 1 to σ 6 . An open question is how to determine the values of the σs. We did one simulation of the deterministic model, one of the reaction model using the SSA and three of the stochastic differential equation model (8) using three different values of the σ, 0.01. 0.025 and 0.05. Figure 3 shows the results of the deterministic simulation on the left and of the SSA simulation on the right. The stochastic simulation shows the plots for the mean and the mean +/− standard deviation for 1000 realizations. Figure 4 shows the stochastic differential equation simulations results for the mean and the mean +/− one standard deviation for 1000 simulations with all σs equal to 0.01 on the left and with the σs equal to 0.025 on the right. Figure 5 shows on the left the mean and the mean +/− one standard deviation plots for the σs equal to 0.05. From these figures we see that the σs equal to 0.025 gives a stochastic effect that is important but does not overcome the deterministic part. The plots for the mean values of the stochastic simulations were taken for 1000 realizations since the results for 500 and 1000 realizations agree to at least three significant figures.   Table 1 and τ 1 = τ 2 = 0. Figure 6 has the plots of the deterministic simulation. Figure 7 has the plots of simulations using the SSA (left) and stochastic differential equation s (right). For the SSA, the vertical lines represent the intervals [mean − 1 standard deviation, mean + 1 standard deviation]. For the stochastic differential equations plot the mean is given by the solid lines and the dashed lines give the mean +/− one standard deviation for 1000 realizations for the stochastic simulations. On the left for the SSA simulation and on the right for the stochastic differential equation run with σs equal to 0.025.   Table 1 and the delays were τ 1 = 24 and τ 2 = 1. Figure 8 has the plot of the deterministic simulation. Figure 9 has the plots of the mean and of the mean +/− one standard deviation for 1000 realizations for the stochastic simulations. On the left are the results for the SSA simulation, and the vertical lines represent the intervals (mean − 1 standard deviation, mean + 1 standard deviation). On the right are the plots for the stochastic differential equation run with σ equal to 0.025. The solid lines represent the mean values and the dashed lines the mean +/− one standard deviation.

Discussions
The numerical simulations in all cases gave similar results for the differential equation, the stochastic differential and the reaction models. The differential equation simulations give one trajectory and do not take into account the variations in the populations and environment, or the errors in measurements of parameters. However, they are the easiest simulations to implement and the fastest. The stochastic differential equations include variations of the populations when multiplicative noise is added, as we did. By doing multiple simulations, they give the mean values and the standard deviations of the solutions, which are more realistic. However, there are other types of white noise and we need to estimate the size of the noise. The reaction model simulations using the Gillespie algorithm also give the means and standard deviations of the solutions when run for multiple simulations. It has a more solid theoretical base for small populations, and this is our case. Through both the stochastic differential equation method and the Gillespie algorithm there are some outcomes with different behavior than what comes from the deterministic method. For some realizations the disease may not disappear for very long time periods, and in future work it may be worth estimating the probability.

Conclusions
Mathematical population models based on differential equations give realistic results even when the populations are not very large. However, models based on biochemical reactions are more realistic. Models given in terms of reactions are usually easier to understand for biologists, ecologists and other specialists without backgrounds in mathematics. The stochastic simulation algorithm has solid theoretical backing for its use for small populations, and there is a wide variety of existing software implementing it. As with all stochastic simulations, there is the question of how many realizations to use. In our case we did one hundred and then one thousand realizations, and the results have at least three significant digits of agreement. The results for the stochastic differential equations depend on the type and magnitude of the white noise term, and it is usually not obvious which one should be used.
Even though from the first epidemic models it has been known that the models can be written in terms of biochemical reactions [66], their use and the use of Gillespie's algorithm should be more frequent, since in epidemic models the populations are many times not very large and the continuity hypothesis is not justified. Second, this method is easy to implement and does not require one-as the stochastic differential method does-to decide on the type and magnitude of the white noise. There are many papers dealing with the use of Gillespie's algorithm for epidemic models and others on using biochemical reactions and other methods. For example, in [67] the authors used a biochemical reactions formulation and cellular automata. In [68], the authors mentioned that epidemic models can be described as chemical reactions but then used a stochastic model based on binomial drawing for certain terms. In [31] the equivalent reactions for a SIR model are given. In some other papers the Gillespie algorithm was applied to variations of the SIR method [69][70][71]. Reference [72] presents continuous time Markov chain and stochastic differential equation models, including an SIR model and a four-population model for malaria. For this last model, the author approximated the branching process by working with the forward and backward Kolmogorov equations. In [70] the model used was an SEIR (Susceptible, Exposed, Infectious, Recovered) with added numbers of patients hospitalized and killed regarding Ebola. To the best of my knowledge, the publications about the applications of the Gillespie algorithm to epidemic models do not include models with saturation interaction terms, particularly with Holling type 2 functionals, or epidemic models with discrete delays; they also do not involve plant diseases caused bv viruses and spread by vectors. The comparison of the results for the model using stochastic equation simulations with different sizes for the white noise is also new. Future work will include applying the application to more complex models and to models with time varying rates.