Determining COVID-19 dynamics using Physics Informed Neural Networks

Abstract

Definition 1. A feedforward neural network with a total of $N$ neurons arranged in a single layer is a function $y : \mathbb{R}^d \to \mathbb{R}$ of the form

$$y(t) = \sum_{i=1}^{N} \alpha_i \, \sigma(w_i^{T} t + b_i),$$

where $t \in \mathbb{R}^d$, $\alpha_i, b_i \in \mathbb{R}$, $\sigma$ is the activation function, and $w_i$ are the weights by which each neuron multiplies its input.

$$\rho(x,t) = -\frac{\partial N(x,t)}{\partial x}. \qquad (2)$$

The conservation law states

The relationship between the stated variables is:

where $v_f$ = traffic free flow and $\rho_m$ = maximum traffic flow.

The cost function $J_{DL}$ is used to improve the accuracy of the neural network; it is calculated as the mean square error (MSE) of $N$ outputs at point $x$ and time $t$, where $\rho^*(x,t)$ is the neural network's prediction and $\rho(x,t)$ is the true value. The physics informed neural network additionally computes the MSE $J_{PHY}$, which measures how well the stated conservation laws are satisfied.
$$J_{DL} = \frac{1}{N} \sum |\rho(x,t) - \rho^*(x,t)|^2 \qquad (6)$$

The physics informed neural network is then used to optimize the neural network and a param-

The study thus introduces the use of a physics informed neural network to implement a training process dependent on both data and physical laws. The study uses a single machine infinite bus (SMIB) system, which is a system with only one generator. The parameters and variables in the equation include the inertia constant $m_1$, the damping coefficient $d_1$, and $B_{12}$, the entry of the bus susceptance matrix. $P_1$ is the power generated by the generator, $V_1$ and $V_2$ are the voltage magnitudes of buses 1 and 2, $\sigma_1$ and $\sigma_2$ represent the voltage angles behind the reactance, and $\dot{\sigma}$ is the angular frequency of the generator.

Thus the resulting function is

The common mathematical models, however, do not account for these in the predictions they make, which makes it hard to account for elements such as overcrowding, social distancing and other policies that may have been implemented by different countries. The main policies highlighted by the authors include the use of police to enforce proper social distancing at traffic crossings, shops and other places, as well as the shutdown of public transport, trains and airports. To account for these multiple policies and obtain a better prediction, the study uses real data together with estimations. The study also estimates the effective reproduction rate. The first model:

Subject to the initial conditions $S = S_0$, $I = I_0$, $R = R_0$ and $E = E_0$.
The second model used in the study accounts for quarantine control. The model introduces a time dependent variable $T(t) = Q(t) \times I(t)$. This also changes the effective reproduction rate

The parameter $Q(t)$ is determined using a separate neural network, $NN(W,U)$, which takes the time, susceptible, exposed, infected and recovered data as input. The network processes the data in 2 layers with 10 nodes per layer and uses a ReLU activation function. The determined $Q(t)$ is then passed to the physics informed neural network, which uses the model below to make its approximations.
Subject to the initial conditions $S = S_0$, $I = I_0$, $R = R_0$ and $T = T_0$, where $\beta$ is the spreading rate, $\gamma$ is the recovery rate and $\delta$ is the death rate [49].
Infected individuals leave the infected group either by recovering, at the rate $\gamma$, or by dying, at the rate $\delta$. This means $\delta$ is the death rate, $\beta$ is the infection rate and $\gamma$ is the recovery rate. Figure 1 shows the resulting COVID-19 transmission SIRD flow diagram.

From the flow diagram in Figure 1 we obtain the system.
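For reference, a standard SIRD system consistent with the compartments and rates described here can be written as follows. This is a sketch only: the mass-action incidence term and the total population $N = S + I + R + D$ are assumptions, and the exact form used in the study may differ.

```latex
\begin{align*}
\frac{dS}{dt} &= -\frac{\beta S I}{N}, \\
\frac{dI}{dt} &= \frac{\beta S I}{N} - \gamma I - \delta I, \\
\frac{dR}{dt} &= \gamma I, \\
\frac{dD}{dt} &= \delta I.
\end{align*}
```

In this form the four right-hand sides sum to zero, so the total population $N$ stays constant over time.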
The model is subject to the initial values $S = S_0$, $I = I_0$, $R = R_0 = 0$ and $D = D_0 = 0$.

As an aid to accuracy and optimization, studies and implementations of neural networks have shown that working with values less than 1 is better. Hence we need to rescale the given data to take values between 0 and 1 through the non-dimensionalization process.
Substituting into the SIRD model we obtain

Hence the resulting system is.
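As a rough illustration of the rescaling step, the sketch below divides raw counts by the total population and integrates a rescaled SIRD system with a classical fourth-order Runge-Kutta step. The rescaled equations (mass-action incidence `beta * s * i`), the helper names, the population size and the rate values are all assumptions chosen for illustration, not values from the study.

```python
def rescale(counts, population):
    """Divide raw head counts by the total population so values lie in [0, 1]."""
    return [c / population for c in counts]


def sird_rhs(state, beta, gamma, delta):
    """Right-hand side of the assumed rescaled SIRD system."""
    s, i, r, d = state
    new_infections = beta * s * i
    return (-new_infections,
            new_infections - (gamma + delta) * i,
            gamma * i,
            delta * i)


def rk4_step(state, dt, beta, gamma, delta):
    """Advance the state by one classical Runge-Kutta step of size dt."""
    def shifted(base, slope, h):
        return tuple(b + h * k for b, k in zip(base, slope))

    k1 = sird_rhs(state, beta, gamma, delta)
    k2 = sird_rhs(shifted(state, k1, dt / 2), beta, gamma, delta)
    k3 = sird_rhs(shifted(state, k2, dt / 2), beta, gamma, delta)
    k4 = sird_rhs(shifted(state, k3, dt), beta, gamma, delta)
    return tuple(x + dt / 6 * (a + 2 * b + 2 * c + e)
                 for x, a, b, c, e in zip(state, k1, k2, k3, k4))


def simulate(state0, beta, gamma, delta, steps, dt):
    """Return the list of states visited, including the initial one."""
    trajectory = [tuple(state0)]
    for _ in range(steps):
        trajectory.append(rk4_step(trajectory[-1], dt, beta, gamma, delta))
    return trajectory


# Hypothetical numbers, for illustration only.
POPULATION = 1_000_000
raw0 = [POPULATION - 100, 100, 0, 0]   # S0, I0, R0 = 0, D0 = 0
state0 = rescale(raw0, POPULATION)
trajectory = simulate(state0, beta=0.3, gamma=0.1, delta=0.01, steps=2000, dt=0.1)
```

After rescaling, the four compartments sum to 1 at every step, which gives a simple sanity check on any integrator or network output.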
The neural network that we develop takes a single input value, the time $t$. The input is passed through the layers with weights $W_{i,j}$, where $i$ is the position of the start node and $j$ is the position of the end node.

The representation of a neural network matrix with $m$ layers and $n$ nodes per layer.
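A minimal sketch of such a network in plain Python is given below, with weights indexed as `weights[i][j]` from start node `i` to end node `j`. The four outputs (read as rescaled S, I, R, D), the tanh hidden activation, the sigmoid output activation, the random initialization and all function names are assumptions for illustration.

```python
import math
import random


def init_network(n_layers, n_nodes, n_outputs=4, seed=0):
    """Build n_layers hidden layers of n_nodes each, plus one output layer."""
    rng = random.Random(seed)
    sizes = [1] + [n_nodes] * n_layers + [n_outputs]   # single input: time t
    layers = []
    for fan_in, fan_out in zip(sizes, sizes[1:]):
        # weights[i][j] connects start node i to end node j
        weights = [[rng.uniform(-1.0, 1.0) / math.sqrt(fan_in)
                    for _ in range(fan_out)]
                   for _ in range(fan_in)]
        biases = [0.0] * fan_out
        layers.append((weights, biases))
    return layers


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(layers, t):
    """Propagate the scalar time t through the network."""
    activations = [t]
    for depth, (weights, biases) in enumerate(layers):
        pre_activations = [
            sum(a * weights[i][j] for i, a in enumerate(activations)) + biases[j]
            for j in range(len(biases))
        ]
        is_output = depth == len(layers) - 1
        # Sigmoid on the output keeps predictions in (0, 1), matching rescaled data.
        activations = [sigmoid(z) if is_output else math.tanh(z)
                       for z in pre_activations]
    return activations


network = init_network(n_layers=3, n_nodes=30)
prediction = forward(network, t=0.5)   # four values in (0, 1)
```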

The loss function
To optimize a neural network through backpropagation, a loss function must first be defined.
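One way a physics informed loss for a SIRD model could be assembled is sketched below: a data term (MSE against observations) plus a physics term (MSE of the ODE residuals). This is an illustrative sketch, not the study's implementation. The rescaled SIRD right-hand side is assumed, central finite differences stand in for the automatic differentiation a PINN would normally use, and all function names are hypothetical.

```python
def data_loss(predicted, observed):
    """Mean square error between predicted and observed (S, I, R, D) tuples."""
    total, count = 0.0, 0
    for p, o in zip(predicted, observed):
        for pv, ov in zip(p, o):
            total += (pv - ov) ** 2
            count += 1
    return total / count


def physics_loss(predicted, dt, beta, gamma, delta):
    """Mean square residual of the assumed SIRD equations at interior points."""
    total, count = 0.0, 0
    for k in range(1, len(predicted) - 1):
        s, i, r, d = predicted[k]
        # Central difference approximation of the time derivatives.
        derivs = [(nxt - prv) / (2 * dt)
                  for nxt, prv in zip(predicted[k + 1], predicted[k - 1])]
        rhs = (-beta * s * i,
               beta * s * i - (gamma + delta) * i,
               gamma * i,
               delta * i)
        for dv, fv in zip(derivs, rhs):
            total += (dv - fv) ** 2
            count += 1
    return total / count


def total_loss(predicted, observed, dt, beta, gamma, delta):
    """Unweighted sum of the data term and the physics term."""
    return (data_loss(predicted, observed)
            + physics_loss(predicted, dt, beta, gamma, delta))


# A disease-free steady state satisfies the equations exactly ...
steady = [(1.0, 0.0, 0.0, 0.0)] * 5
# ... while infections that grow with no matching drop in S do not.
drifting = [(1.0, 0.1 * k, 0.0, 0.0) for k in range(5)]

steady_physics = physics_loss(steady, 1.0, 0.3, 0.1, 0.01)
drifting_physics = physics_loss(drifting, 1.0, 0.3, 0.1, 0.01)
combined = total_loss(drifting, steady, 1.0, 0.3, 0.1, 0.01)
```

If the rates `beta`, `gamma` and `delta` are treated as trainable alongside the network weights, minimizing this sum also yields estimates of those rates.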

For the PINNs we developed, we find the loss function $loss_T$ by obtaining the sum of two losses.

To understand COVID-19, which has become a pandemic, we need to determine the minimum rate at which secondary infections must occur for a pandemic to develop. The reproduction number $R_0$ is also the rate below which the spread would stop. The following is its derivation.
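A sketch of the standard threshold argument is given here. It assumes the rescaled SIRD equation for the infected compartment, $I > 0$, and an almost fully susceptible population at the start of the outbreak ($S \approx 1$); these are assumptions rather than the study's exact steps.

```latex
\begin{align*}
\frac{dI}{dt} &= \beta S I - (\gamma + \delta) I
              = \bigl(\beta S - (\gamma + \delta)\bigr) I, \\
\frac{dI}{dt} > 0 &\iff \beta S > \gamma + \delta
              \iff \frac{\beta S}{\gamma + \delta} > 1, \\
R_0 &= \frac{\beta}{\gamma + \delta}
      \quad \text{(taking } S \approx 1 \text{ initially)}.
\end{align*}
```

Under these assumptions, infections grow when $R_0 > 1$ and decline when $R_0 < 1$.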

The sensitivity analysis of the mathematical model also provides some of the key properties of .
Integrating we obtain

To obtain the maximum value, we find the point where Equation 3.2.2 is equal to zero, and we determine that it occurs when $S = \frac{\gamma + \delta}{\beta}$.

Now we obtain the number of people that we expect to eventually get infected. The fate of infected individuals is that they either recover, $R_{end}$, or die, $D_{end}$; hence we find the expected number of people who either recover or die.
where $S_{end} = \frac{\gamma + \delta}{\beta} \ln(S_{end})$, and $S$ represents the susceptible, $I$ the infected, $R$ the recovered and $D$ the removed, with the removal rate $(\gamma + \delta)$. Such that we have,

Which can be rewritten as,

The results discussed here were obtained from a PINN with 3 layers and 30 nodes per layer.

Figure 3 shows the artificial data used for the early training process. Figure 4 shows the data fit for the susceptible population; the fit is good, which means the error is small. Figure 5 shows the obtained results; the graphs fit well, which means the error is small. Figure 6 shows the results for the recovered population, which also fit well with small errors. Figure 7 shows the results for the deceased population; the fit is good, but less accurate than in the other graphs.

Figure 8 shows the data fit for the susceptible population; the fit is good, which means the error is minimal. Figure 9 shows the data fit, which is good and has a small error. Figure 10 shows the results for the recovered population, which also fit well with small errors. Figure 11 shows the results for the deceased population; the fit is good, but the error is larger than in the other graphs.

Results were also obtained after training the model on a small data set made up of 30% of the available data. Figure 12 shows the graph obtained only for data fitting purposes for the susceptible population; the fit is good, hence the error is small. Figure 13 shows the data fit, which is good and has a small error. Figure 14 shows the results for the recovered population, which also fit well with small errors. Figure 15 shows the results for the deceased population; the fit is good, but the error is larger than in the other graphs. Overall, although the fits have small errors, these errors are larger than in the cases where a larger data set was used.

The simulations were conducted with 5,00,000 iterations and 4 layers, each layer having 30 nodes. The data set covered 576 days, which was the maximum number of days of data available at the time.

Figure 16 shows the data fit for the susceptible population; the error is small and the fit is good. Figure 17 shows the data fit, which is good and has a small error. Figure 18 shows the results for the recovered population, which also fit well with small errors. Figure 19 shows the results for the deceased population; the fit is good, but the error is larger than in the other graphs. Overall, although the fits have small errors, these errors are larger than in the cases where a larger data set was used.

shows that an increase in the number of layers also reduces the error. That means that when the number of layers and iterations are increased, much better accuracy is achieved.

number of nodes has limited effects. Table 3 shows that an increase in the number of hidden layers reduces the margin of error. It also shows that an increase in the size of the data reduces the margin of error. This shows that an increase in both the data size and the number of iterations will reduce the margin of error.

Table 4 shows that as the number of nodes per layer is increased, the margin of error is reduced. The results also show that as the number of iterations is increased, the error also reduces. Table 5 shows that as the number of iterations increases, the error is reduced. It also shows that the error is random for smaller numbers of iterations, but as the iterations increase, the larger data sets achieve smaller errors. Table 6 shows that as the amount of data increases for a given number of nodes, the error is reduced; however, as the number of nodes is increased, the size of the error increases.

The obtained results show that the model is well suited to making predictions of values within the training period. This means that the model is well suited to data fitting where there are days on which data was not collected, was wrongly entered or was lost. Another benefit of the framework is that it returns the spreading rate, death rate and recovery rate, which were set up to serve as the adjustable variables for the PINN part of the neural network. One limitation, however, is that increasing the number of forecasted days leads to diminishing accuracy. This occurs because the predictions are linked to the spread rate, death rate and recovery rate patterns, which are determined using the older available data. The model also predicts the wave pattern displayed by the active infections. We conclude that the model is highly suitable for making short-term predictions, while long-term predictions can be made with a sustainable margin of error.

One big limitation faced during the research was the shortage of data and processing power. For further research, we therefore recommend the development of a model which groups each of the SIRD populations by age, since it has been shown that different age groups are affected by the disease differently. Using such well segmented data, each of the parameters or rates can then be defined per age group. We also recommend that a model similar to the one used in this study be tested at a larger scale using higher processing power.