On a Coupled Time-Dependent SIR Models Fitting with New York and New-Jersey States COVID-19 Data

This article describes a simple Susceptible Infected Recovered (SIR) model fitting with COVID-19 data for the month of March 2020 in New York (NY) state. The model is a classical SIR, but is non-autonomous; the rate of susceptible people becoming infected is adjusted over time in order to fit the available data. The death rate is also secondarily adjusted. Our fitting is made under the assumption that due to limiting number of tests, a large part of the infected population has not been tested positive. In the last part, we extend the model to take into account the daily fluxes between New Jersey (NJ) and NY states and fit the data for both states. Our simple model fits the available data, and illustrates typical dynamics of the disease: exponential increase, apex and decrease. The model highlights a decrease in the transmission rate over the period which gives a quantitative illustration about how lockdown policies reduce the spread of the pandemic. The coupled model with NY and NJ states shows a wave in NJ following the NY wave, illustrating the mechanism of spread from one attractive hot spot to its neighbor.


Introduction and model
From a scientific perspective, the COVID-19 pandemic has highlighted the crucial role of mathematical and statistical models in providing guidance for health policies.Expressions such as "flatten the curve", the "apex", the "plateau" have been widely heard in medias and employed by decision makers to explain their choices regarding rules and policies during this critical period.In this short article, we first introduce a simple SIR model, in which we adjust a key parameter k standing for a control on the Susceptible-Infected rate, and secondarily the death rate, in order to fit the data of the pandemic in NY state in March 2020, and provide predictions for a near future.Then, we add a node in the model to take into account the daily fluxes between NY and NJ states.Note that these two close states, are, up to the day of redaction of this article, the more severe hit by the pandemic in United States.Of course, the coupling may be extended to other states.However, in this article, we restrict ourselves to NY and NJ.Accordingly, the main key points of this article are that, 1) it highlights the dynamics and epidemiological characteristics which have been discussed in press and health policies; It highlights qualitatively how lockdown policies have decreased the spread of the virus and provides prediction and explanation of an upcoming apex, 2) it fits real data provided for the New York state and 3) it fits the data of NJ state by considering coupled equations taking into account the daily fluxes between NY and NJ.This provides a quantitative visualization of how the virus may spread from an attractive hot spot (New York City in NY state) towards close states trough the daily fluxes of commuters.
We especially focus on fitting the total number of cases tested positive for COVID-19 as well as the number of deaths in both NY and NJ states.We also give insights in prediction of the number of people needing hospitalization in NY state.
SIR models are very classic in literature.For some reader's convenience, we mention here some contextual elements and references.The simplest classical SIR model is the KermackMcKendrick (KMcK) model which goes back to 1927, see [12,20].It writes In this original (KMcK) model, the population splits into three classes.The class S stands for susceptible, who can catch the disease, I stands for infective, who have the disease and can transmit it and R stands for the removed, namely, those who have or have had the disease but not transmit it anymore.Note that our terminology is slightly different as explained below.In (1), the dynamics follow the scheme S → I → R with respective transfer rates between classes of k and r.
SIR type models and more generally, mathematical models of epidemics have in fact a significant history.We refer to [20] for a textbook on these models and a brief history of epidemics.Models have become more sophisticated and may include more compartments such as exposed, infective asymptomatic, infective with symptoms, and also reservoirs such as bats, and include stochastic dynamics.Recently, SIR type models have been widely used in the context of the COVID-19 pandemic, to model the spread all around the world, see for example [11,15,9,13,16,17,22] and reference therein cited.Here are also some examples of references for SIR models in other epidemic diseases: Dengue [?], Chikungunya [18,19] and Ebola [23].See also references therein cited.
In the present article, we first consider the following model This simple model has classically three classes: S for susceptible, I for infected and R for recovered.Specifically, the class I is intended to represent all the people who bear effectively the virus at a given time, and can transmit it if in contact with other people.It includes all infected people with or without symptoms, reported or not.There are some differences with equation ( 1).First, it includes a death rate d(t).And even tough, the number of deaths does not appear explicitly as a variable, it is simply given by the integral t 0 d(u)I(u)du.Also note that in expression the rate of contamination from S to I is proportional to the proportion of susceptible (S) in the whole population (S + I + R).This is a classical expression standing for the fact that the probability for each individual in the I class to spread the virus among the class S is proportional to the portion of S in the whole population, see for example [10,24] and references therein cited.This rate is corrected by a crucial coefficient k(t) which is intended to fit the real transfer rate and which contains the effects inherent to the properties of the virus (for example change of propagation rate due to genetic mutation of the virus) or to specific policies (like quarantine, social distancing, lockdown...).This time dependence allows to adjust the dynamics to fit the data.This is a specificity of our model and turns it into a non-autonomous equation.This time dependence of k is obviously relevant in our model since the rate of transfer from S to R is the main target of health policies and is subsequently subject to vary over time.Secondarily, we also allow the death rate to vary.Many internal or external factors may affect the death rate among which are concomitant lethal disease, temperature, hospital conditions...More significantly, one has to note that the rate transfers considered here are instantaneous transfer between compartments, and the function d(t) is different from the Case Fatality Rate (CFR).Recall that the CFR is the death rate per confirmed case over a given period of time, and is a typical indicator for death rate.In South Korea, the country which led the highest number of tests, it has been reported to be of 1 percent, see [14].In China as of February 2020, this rate varied between 3.8 in the region of Wuhan and 0.7 in others regions, see [1] and also [8].Using tables 1 and 2 gives a CFR in NY from March 1 to April 1 of 1941 83804 2.3 per cent.Since our model fits the data, by definition, it fits the CFR.At the end of the epidemic, if the whole number of people that contracted the virus would be tested positive the CFR would provide the probability to die for an individual who catches the virus.However, during a growing phase such as the month of March considered here, the CFR has large variations.Moreover, there is a delay between the time a person is tested positive and the time he dies.The time between symptom onset and death has been reported to range from about 2 to 8 weeks, see [8].The typical average being 23 days according to [14].One could for example introduce a time-integral death expression t 0 d(u, t)I(u)du to take into account these informations.However, the time-window considered here is short and corresponds to the beginning of reported cases in NY.Furthermore, in this short article, we wanted to focus on a simple model able to fit data, highlight relevant dynamics and provide estimations.Since, a person in the I compartment, will either recover or die, above remarks on the function d hold for the coefficient r.In the present work, for sake of simplicity, we have set the r coefficient to the constant value 0.64.This is a simplification which is classical in SIR models.Note that setting the coefficient r to a constant value is equivalent to assume that people recovering between times t − 1 and t, which is given by R(t) − R(t − 1) would also write r t t−1 I(u)du according to equations (2).A more meaningful expression for R(t) − R(t − 1) would be t 0 r(u, t)I(u)du standing to the fact that people recovering between time t − 1 and t have been infected from a period ranging from 0 to t, with a transfer rate given by r(u, t).This would lead to the equation r = t 0 r(u,t)I(u)du t t−1 I(u)du .Once r is set to a constant value, to fit data over a reasonable period, numerical tries provide a unique choice for constants k and d.It was still possible to use different values of r to fit the data.However, different values of r result in different dynamics over time, and notably different time for apex.Our choice of r = 0.64 was made to provide dynamics that seem relevant to us regrading the timely dynamics beyond the data.In particular, too smaller of values for r would provide dynamics with a late apex and less relevant regarding the timely effects of the disease.Other studies have considered models with a constant r, with different values.See for example [21] and references therein cited.Remarkably, varying r, and make it depend on time, provides some freedom to later fit data over a longer period of time, taking in account the end of the first wave in NY.
Upon the above discussions, our strategy is rather simple: set r to a constant value.Then choose a constant k which fits the data of positive cases during a period of time.Next, choose a constant d to fit deaths data for the same period of time.Note that since the parameter k has much more effect than the small parameter d, this procedure is possible and efficient.After, repeat the procedure over a subsequent period.The overall procedure results in a constant function r and two piecewise constant functions k(t) and d(t).For our model and the given data, the procedure was efficient allowing to provide these choices by successive tries.Note that this could be done automatically by cooking an algorithm to set the parameters following these guidelines.Note finally that our assumptions do not a include birth rate for the susceptible population but rather focus on the short time effects of the disease.
2 Numerical Simulations, Data and Dynamics

Fitting the total number of infected people and the number of deaths
Data from total infected people and total number of deaths in New York state is available at the New York times web site, see [3].We have downloaded the data from there from March 1 to April 1, which makes 32 days.We have reported the number of total cases in table 1 and the number of deaths in table 2. For numerical simulations of equation ( 2), the parameter r was set to 0.64.Then, in order to fit the data of the total number of infected people, we chose: and to fit the total number of deaths, we chose: Initial conditions were set to (19453556, 5, 0), 19453556 being the number of people living in NY state in 2019 according to USA Census bureau.Regarding reported data in tables 1 and 2, as a matter of fact, not all infected people in March 2020 were tested.To take into account this fact in our model, we adjusted the parameters to fit the quantity 0.2 × (I + R) from the model, with the total number of infected people (data in table 1) minus the total number of deaths (data in table 2), i.e. we assume here that only 20 percent of living people that have contracted the virus has been tested positive.Note that shortly after the first submission of this work, NY state reported the result of a random antibody test over a sample of 3000 individuals.Statewide, 13.9 percent of people were tested positive, ranging form 21.2 percent in New York City (NYC), to 3.6 percent in upstate counties.See [6,5].Comparing with data from [3], this gives a ratio of approximately 10; only about 10 percent of people having caught the virus in NY would have been tested positive.
Statistical estimates of related ratio, depending on the number of estimates provide a range of [5,3000], see [2].Simulation of system (2), and comparison with data resulting from tables 1 and 2 are plotted in  In Figure 1-b, we plotted the total number of deaths from the model and compared it to the data.In Figure 1-c, we have plotted four curves corresponding to various I(t) for different simulations of (2).
• The curve I1(t) in red corresponds to the simulation of (2) with k(t) = k1 = 1.057 and d(t) = d1 = 0.0016 for all time.

Fitting the total number of people at hospital
Next, we have retrieved data corresponding to the number of people being at the hospital between March 16 and April 1.During this period, and after, NY state officially reported daily useful charts and statistics, on the local spread.We have then computed an estimation of people being effectively at hospital at a given date.We denote H(ti), ti ∈ {16, 17, ..., 32} the total number of hospitalizations at a given date.In order to fit the solution of (2) with this data, we performed a linear regression between (I(ti)), ti ∈ {16, 17, ..., 32} and (H(ti)), ti ∈ {19, 20, ..., 32}.The coefficients a and b of the linear regression were determined by the least-square method, leading to the formulas: 1.057 if t < 21; 0.9 otherwise 0.0016 if t < 21; 0.00232 otherwise    2) and statistical methods.Panel (a) illustrates an approximation of I(t i ), t i ∈ {16, 17, ..., 32} by a vector (a 1 H(t i ) + b 1 ), t i ∈ {16, 17, ..., 23} and another vector (a 2 H(t i ) + b 2 ), t i ∈ {24, 25, ..., 32} where a j and b j , j ∈ {1, 2} are the coefficients obtained thanks to the least-square method.Panel (b) then provides a prediction of people in need of hospitalization by plotting the quantity H(t) = I(t)−b2 a2 if I(t) ≥ 41475.76, The result is plotted in Figure 2 Then, a prevision of number of people needing hospitalization can be provided by the formula: The result is plotted in Figure 2-b.

Dynamics
Summing up the equations in (2) and looking for stationary solutions yield the following theorem.
Theorem 1.We assume that initial condition of system (2) satisfies S(0) > 0, I(0) > 0 and R(0) = 0. Then for t > 0, all variables remain bounded and positive: there exists a positive constant M < R(0)+I(0) such that for all t > 0: Non-negative stationary solutions of the system are given by ( S, 0, R), with S ≥ 0 and R ≥ 0. Furthermore, S(t) is decreasing, R(t) is increasing and the variation of I(t) is given by the sign of Remark 1.It is worth noting that this theorem, which proof is relatively straightforward provides two simple but relevant interpretations from the applicative point of view.The first thing is that the stationary solutions are given by ( S, 0, R), with S ≥ 0 and R ≥ 0. This means that the stationary solutions, are all with 0 infected but may take arbitrary non-negative values (within a bounded interval) for susceptible and recovered.This reflects the reality of the disappearance of the virus.The second thing we want to mention is that the variation of I is given by the sign of expression (5).In particular, basically during a classical wave, this sign will be positive before the apex and negative after.
for node 2, where functions +cij stand for the densities of population coming from mode j and going into node i.In the remaining of the manuscript we assume that these functions cij do not depend on S, I and R and drop the superscripts.Furthermore, we assume c12 and c21 to be periodic of period 1 (one day), with a Gaussian profile and respective apex at 8:30 am and 6:30 pm.Not that the amplitude c21 is multiplied by a coefficient greater than 1. in comparison with c12 to take in account the amount of population int NY and NJ.We want to emphasize here the attractivity of NYC, and therefore assume that the fluxes are mainly from NJ to NY in the morning and form NY to NJ in the evening.The function l(t) integrates policies of lockdown: after March 23, in the model, the daily fluxes are divided by 1000.Therefore, the function l(t) is a piecewise constant function given by: Initial conditions were set to (19453556, 5, 0), for node 1 and (8882190, 0, 0), for node 2. Note there are no infected cases at initial time in NJ.This means that in our model, initial spread in NJ follows from infection in NY.The same methods as in section 2 were used to fit the data for both NY and NJ for the network model (6).Illustrations are provided in Figure 3.It shows how the model fits the data.In Figure 3-a, we have plotted the quantity 0.2(I2 + R2) as a function of time in red.Recall that, in the model I2 + R2, represents the number of people in the population which has been infected by the virus and are still alive.The blue dots correspond to the data retrieved from [3] and plots the total number of infected minus the number of total deaths.Analogously, in Figure 3-b we have plotted the quantity t 0 d2(u)I2(u)du as a function of time, in red.The blue dots correspond to the data retrieved from [3].Finally, Figure 3-c illustrates I1t) and I2(t), which represent respectively the infected population in NY and NJ.It shows how the curve of NJ follows the curve in NY, with a small attenuation.An analog theoretical result as in section 2 holds for solutions of (6).Theorem 2. We assume that initial condition of system (6) satisfies S1(0) > 0, I1(0) > 0, R2(0) = 0, S2(0) > 0, I2(0) = 0 and R1(0) = 0. Then for t > 0, all variables remain bounded and positive: there exists a positive constant M < R(0) + I(0) such that for all t > 0: For any non negative S1(0), R1(0), S2(0), R2(0), the following functions satisfying for t > 0 I1(t) = I2(t) = 0 S1(t) + S2(t) = S1(0) + S2(0) R1(t) + R2(t) = R1(0) + R2(0) S1t = −S2t = l(t) c12(t)S2 − c21(t)S1 R1t = −R2t = l(t) c12(t)R2 − c21(t)R1 are solutions of (6) Remark 2. As in remark 1, we want to point out some interpretation of this theorem in the context of the pandemic.Here, we have described some solutions with free epidemic component (I ( t) = I2(t) = 0).Note that in this case however, Si and Ri are not constant since they vary according to the fluxes between the two nodes.Again the last inequality, quantifies the idea that when the spread of the virus becomes lower than the death and recovery the infected population starts to decrease.The difference here is that we take into account the two nodes together.

Conclusion
In this article, we have considered a simple non-autonomous SIR model to fit the data of COVID-19 in New York state.The model illustrates and quantifies how acting on the control k(t) allows to flatten the curve of infected people over the time.From the model, using classical statistical methods, it is then possible to provide predictions of the number of people in needs of hospitalization.Lastly, we have fitted data from NJ state thanks to a coupled SIR model taking into account the daily fluxes between NJ and NY.It allows to predict similar dynamics in NY and NJ, with a delay and small attenuation.Note that, despite its simplicity, our model fits, as good as more sophisticated models, the available data during the growing phase of March 2020.In a forthcoming work, we aim to fit data with the model over a longer period .This would provide an accurate estimation of the different parameters.

Figure 1 -
Figure1-a.In Figure1-b, we plotted the total number of deaths from the model and compared it to the data.In Figure1-c, we have plotted four curves corresponding to various I(t) for different simulations of (2).

Figure 1 :
Figure 1: This figure illustrates the simulation of system (2) and how it fits the data.In (a), we have plotted the quantity 0.2 × (I + R) as a function of time in red.The blue dots correspond to the data retrieved from [3].Analogously, in (b), we have plotted the quantity t 0 d(s)I(s)ds, which represents the total number of deaths according with the model, as a function of time in red.The blue dots correspond to the data retrieved from [3].In (c), we have illustrated the quantity I(t) corresponding to different values of k(t), d(t): the curve I 1 (t) in red corresponds to the simulation of (2) with k(t) = k 1 and d(t) = d 1 for all time.The curve I 2 (t) in green corresponds to the simulation of (2) with k(t) = k 1 and d(t) = d 1 for 0 ≤ t ≤ 21, and k(t) = k 2 and d(t) = d 2 for t ≥ 21.The curve I 3 (t) in pink corresponds to the simulation of (2) with k(t) as given in (3), i.e. k(t) = k i and d(t) = d i , t ∈ [t i−1 , t i ), i ∈ {1, ..., 4} with t 0 = 0, t 1 = 21, t 2 = 24, t 3 = 27, t 4 = 32.It illustrates how the health policies flatten the curve.In (d), we have again plotted the solution I(t) for k(t) as in (3), for a longer period.

Figure 3 :
Figure 3: This figure illustrates the simulation of system(6) and how it fits the data.In (a), we have plotted the quantity 0.2(I 2 + R 2 ) as a function of time in red.Recall that, in the model I 2 + R 2 , represents the number of people in the population which has been infected by the virus and are still alive.The blue dots correspond to the data retrieved from[3] and plots the total number of infected minus the number of total deaths.Analogously, in (b) we have plotted the quantity t 0 d2(u)I 2 (u)du) as a function of time in red.The blue dots correspond to the total number of deaths in NJ according with data retrieved from[3].Panel (c) illustrates I 1 (t) and I 2 (t), which represent respectively the infected in NY and NJ.

Table 1 :
[3]al number of cases reported in NY state from march 1 to April 1.See[3].

Table 2 :
[3]al number of deaths reported in NY state from march 1 to April 1.See[3].