Fractional Stochastic Differential Equation Approach for Spreading of Diseases

The nonlinear fractional stochastic differential equation approach with Hurst parameter H within interval H∈(0,1) to study the time evolution of the number of those infected by the coronavirus in countries where the number of cases is large as Brazil is studied. The rises and falls of novel cases daily or the fluctuations in the official data are treated as a random term in the stochastic differential equation for the fractional Brownian motion. The projection of novel cases in the future is treated as quadratic mean deviation in the official data of novel cases daily since the beginning of the pandemic up to the present. Moreover, the rescaled range analysis (RS) is employed to determine the Hurst index for the time series of novel cases and some statistical tests are performed with the aim to determine the shape of the probability density of novel cases in the future.


Introduction
The coronavirus disease emerged at the end of 2019 and since then it has become a global crisis. The very quick spreading of the disease made the study of its spreading of great importance for its control and prevention [1][2][3][4][5][6]. On the other hand, the mathematical modeling for the spreading of diseases has been performed since the works of Daniel Bernoulli in the 17th century for spreading of the small pox. Later, the Bernoulli model has been modified with the aim that unbounded growth does not occur and randomness effects are included in the model. At first, a factor was introduced that had the effect of making the variation rate of population growth N, negative, being, in this case that the model is given by dN/dt = −αN(1 − νN) (1 − ηN), where the parameters α and η are in the range α > 0 and 0 < η < ν. Moreover, taking some randomness in these parameters, we obtain a more realistic model to handle real situations [7][8][9][10][11][12]. In a general way, whether the size of the population at time t is N(t), being the relative rate of growth of the population at time t, we have the effective population in time t, N e (t), subject to some randomness due to the environmental effects: N e (t) = N(t) + ξ(t), where the first term N(t) is deterministic, while the second term ξ(t) is random (reflecting the environmental randomness effect).
The purpose of this paper is the epidemic modeling of new cases of coronavirus using the nonlinear stochastic differential equation approach with fractional white noise and Hurst parameter at range H ∈ (0, 1), thus showing that the fluctuations of officially released COVID data follow non-trivial self-similar and Gaussian statistics, as a fractional Gaussian driving noise would imply. We analyze the rises and falls of novel cases daily of COVID-19 reported by the public healthy agencies in countries as Brazil, where the low number tests performed in the population lead to a great uncertainty in the official data. Due to the possibility that the dynamics of novel cases might exhibit long-range correlations; thus, it may be described in terms of long-memory processes such as the fractional Brownian motion [13]. Since in many problems related to network traffic analysis, finances and many other fields, the processes under study seem empirically to display self-similar properties and the long-range dependent properties and since the fractional Brownian motions are the simplest processes of this kind, it is important to have a systematic study of these processes and to use them to construct other stochastic processes.
The use of standard stochastic differential equations has already been used in the literature to study the behavior of time series for price dynamics in refs. [14][15][16][17]. In addition, the use of standard Brownian motion that corresponds to the case of Hurt index H = 1/2 was employed recently for studying the spreading of COVID-19 in ref. [3]. The stochastic dynamics smoking with non-Gaussian noise was studied in refs. [18][19][20]. The dynamics of a stochastic COVID-19 epidemic model with jump-diffusion was then analyzed in ref. [18]. The stochastic susceptible-infected-recovered (SIR) dynamics expressed using an ordinary Itô-stochastic differential equation has been proposed in refs. [21][22][23][24][25][26][27]. The statistical analysis described by the master equation and transition rates for the infection process was made in ref. [28]. Moreover, the statistical inference in the epidemic model for the Ebola virus was performed in ref. [29]. The discrete-time epidemic model with binomial distribution has been used to study the transmission rate of COVID-19 in [30]. A statistical model for the health care impact of COVID-19 in India has been studied in [31]. The impact of mobility restriction in COVID-19 superspreading was analyzed in [32]. The effect of immunization through vaccination on the susceptible-infected-susceptible (SIS) epidemic spreading model was analyzed in [33]. Here, we propose the analysis based on the fractional Brownian motion to study the dynamics of novel cases. Moreover, we aim to prove that noise has long memory. The plan of this paper is as follows: in Section 2, we describe fractional white noise analysis as a model for dynamics of the spreading of disease. In Section 3, we present the numerical results for the model and the long range memory by rescaled range analysis (RS). In Section 4, we present our conclusions and final remarks.

Fractional White Noise Analysis
The fractional Brownian motion (BMF) is defined as a family of Gaussian processes, being introduced by Kolmogorov in ref. [34] and indexed by the Hurst parameter H into the range H ∈ (0, 1), being the first application made by Hurst in 1951 to model the long-term storage capacity of reservoirs along the Nile River [35,36]. The fractional Brownian motion was defined in refs. [37][38][39], where, for each H ∈ (0, 1), a real-valued Gaussian process for all s, t ∈ R + . If H = 1/2, the fractional Brownian motion is the standard Brownian motion or the Wiener process. We consider H restricted to range H ∈ (0, 1) and apply this equation as well as its modifier for the infected numbers by coronavirus N(t) in each day. Hence, we use the nonlinear fractional stochastic equation with a drift term in the form of a logistic model with a random term to perform the analysis of the spreading of novel cases of coronavirus N(t), given by where the drift term, f(t), is a better adjustment using a third order polynomial as f(t) = −30,827 +1177t − 8.0186t 2 + 0.02t 3 in the later considered cases obtained by least squares fit to the data. The add of the least square fit f(t) to the logistic model with noise is to take into account the isolation measures and slow vaccination of the populations of each country, which smooths the curve of the growing of novel cases. Due to this, the data are better fitted with a polynomial than fitting directly the solution of the model without noise.
where W (t) H is a fractional white noise defined by where {e n } ∞ n=1 is the orthonormal basis of L 2 φ (R) and The B H -integration theory of Equation (1) is based on using ordinary products path-wise and leads to an integral that, for integrands where ∆: . These integrals do not have an expectation zero and are called fractional path-wise integrals. Here, we consider the B H -integral considered in ref. [40], defined by For instance, suppose where We concentrate on the B H -integral given in [ where the definition here is an extension of the fractional Itô integral introduced in ref. [37]. Hence, we concentrate on the B H -integral defined by where denotes the Wick's product [37][38][39][40][41][42][43][44]. These integrals use the Wick's products and are known as the fractional Itô integrals with reference to the corresponding situation for standard Brownian motion [38,40]. We consider the white noise calculus based on B H (t), H ∈ (0, 1). Due to the behavior of novel cases, we consider the deterministic part adjusting to the daily novel cases using a suave curve. The aim is simply to provide a smoothed version of the time series using standard methods as the lowest regression. Furthermore, we aim to calculate the ensemble of all possible trajectories using the fractional Brownian motion as is made in Figure 1. The zigzag behavior in the range of large t values reflects in an increase of the uncertainty in the data and low-number of test performed in the population. For modeling of the zigzag behavior of novel cases, we add a random noise term in the logistic model for the spreading of an infectious number as given by Equation (1) in order to simulate the effect of rises and falls of novel cases [3]. The α, ν, and η parameters comes from the logistic model with a threshold. We use the values obtained by Bernoulli for smallpox α = ν = 0.13; however, we should have α(t) and ν(t). The range of η(t) values is into 0 < η(t) < ν(t). f(t) comes from the adjustment of least squares to the data into the period considered. The daily fluctuations with a weekly cycle of reported cases have been observed in many countries and are due to the diagnostic and data reporting practices [45]. We obtain the time series of the model Equation (1) using the values obtained by Bernoulli given by α = ν = 0.13, η = 0.063. Furthermore, we use a low intensity of fractional white noise β as β = 1.0 × 10 −6 .

Numerical Results
We write the BMF model that is a Gaussian process W H (t), t > 0 with zero mean and stationary increments and variance given by W 2 H (t) = t 2H . The sample path of the BMF model is a fractal curve with fractal dimension d = 1/H, where H = 1/2 corresponds to the standard Brownian motion whose increments ∆W H (t) = W H (t + 1) − W H (t) are statistically independent. For H = 1/2, the increments ∆W H (t) are known as fractional white noise, showing long-range correlation as ∆W correspond to random walk [3]. The intensity of the value of fractional white noise used in the calculations is such as β(t) = 1.0 × 10 −6 t 3 .
In Figure 2, we plot the numerical result of the half-width of the distribution of N(t) as a function of time t, σ(t). We calculate the variance of the distribution where the standard deviation gives an estimation of novel cases in each day t. The range of data considered here is from 14 March 2021 up to 21 July 2021. The difference to the real data is due to the approach used. We calculate the nth order moments µ n = (x − m 1 ) n until the fourth order about the mean or central moments, where we have the following relations: 2 . Normalized measures commonly used indicating a deviation from the Gaussian are the kurtosis λ 4 , defined as λ 4 = (µ 4 /σ 4 ) − 3 and the skewness, λ 3 . In Figure 3, we display that the behavior of λ 4 (t). λ 4 (t) is numerically calculated by solving Equation (1)). Moreover, the kurtosis relates to the deviation of the tail of the distribution as compared to Gaussian P(N(t), t) = 1 √ 4πβ(t) e −[N(t)] 2 /4β(t) , whose solution would correspond to Equation (1) for ν(t) = η(t) = 0 and f(t) ≡ 0. The range of negative values obtained for the kurtosis indicates that the shape of the distribution is closest to Wigner's distribution [11]. Furthermore, at range of small t considered, where the kurtosis is close to zero, we have that the distribution is closest to a Gaussian (λ 4 = 0).

Rescaled Range Analysis
There are several estimators for the Hurst exponent, such as rescaled range (RS) analysis and multifractal and detrended fluctuation analysis [46,47] (DMFA) and (DFA) that differ basically by the choice of the fluctuation measure. The obtained exponent by DMFA and DFA is similar to the Hurst exponent, except that both DMFA and DFA may also be applied to signals who have underlying statistics. In Figure 4, we present the Log-Log graphic with the objective of determining the Hurst exponent. The window of data considered for analysis were the last 120 days before 9 January 2021. We use the RS analysis, where the Hurst exponent is used as a measure of long-term memory of time series. It relates to the autocorrelations of the time series and the rate at which these decrease as the lag between pairs of values increases. The RS analysis is the oldest and best-known method to estimate the Hurst indexes. In general, the RS analysis is an estimator of the Hurst exponent for random walks; however, in this nonlinear logistic dynamic, the rescaled range is a reliable exponent. The start point of the method is to determine the standard deviation using the mean field approach as σ = N 2 − N 2 and the range R as The rescaled range series (RS) is given by where H is the Hurst exponent. We obtain the value of the Hurst exponent using the RS method as H = 0.4678 (9). The presented numerical results show the approximation of the Hurst parameter H close to H = 1/2, which is related to the classical Brownian motion and common distribution shape parameter as moments, kurtosis, and skewness. Although the value obtained for the data set is very close to 0.5, we should have H time dependent, H(t), that is, the estimator is relevant in the current context and may be time dependent. The value found for the data set is within the interval H ∈ (0, 0.5), thus indicating an anti-persistence for the time series of case numbers N(t) reported daily.

Conclusions
In brief, we analyzed the spreading of novel cases of coronavirus using the nonlinear stochastic differential equation with fractional white noise (nonlinear fractional Brownian motion approach) to model the rises and falls of novel cases of coronavirus in countries such as Brazil. The variability in the time series, for instance in Brazil, is mostly due to the case reports not being filed on weekends, leading to a large-amplitude oscillation within a period of seven days. Furthermore, the uncertainty in the results generated by the low number of tests performed in the populations and to underreporting generates a large uncertainly in the official results and, consequently, the addition of randomness in the differential equations becomes necessary for the growth of infected numbers. We find a small deviation in the Hurst indexes from H = 1/2 as H ≈ 0.47, indicating a small deviation from the standard Brownian motion approach. An investigation into whether fractional Gaussian noise is indeed the best (phenomenological) explanation for the fluctuations of COVID data can be made in a future work. The starting point would be more established schemes such as DFA, with as few free parameters as possible, and a much larger database, i.e., COVID data from many countries. However, we must be careful in using fractional, i.e., Gaussian noise with algebraic correlation, and further investigate in a data-driven manner what correlation structure the fluctuations display across different data sets.