Modeling the Spread of COVID-19 in Enclosed Spaces

: SEIR models are typically conjured for populations in open environments; however, there seems to be a lack of these types of models that deal with infection rates amongst enclosed spaces. We have also seen certain age groups struggle to deal with COVID-19 more than others, and to this end, we have constructed an age-structured SEIR model that incorporates the Gammaitoni–Nucci model, which is used for infective material in an enclosed space with ventilation. We apply some sensitivity analysis to better understand which parameters have the biggest impact on overall infection rates, as well as create a realistic scenario in which we apply our model to see the comparison in sickness rates amongst four different age groups with different ventilation ﬁltration systems (UVGI, HEPA) and differing quanta production rates.


Introduction
As we enter another phase of what, as of 22 February 2021, was already one of the top ten deadliest diseases in human history [1], it is imperative humanity has all the tools necessary to prevent its most vulnerable, the suffering that many have already had to endure. COVID-19, the novel coronavirus that struck its first victim in Wuhan, China, on 17 November 2019 [2], has exacted a devastating toll throughout both years 2020 and 2021. Currently, the WHO statistics record over 192 million cases of COVID-19, and over four million deaths worldwide, as can be seen in Figure A1 [3].
There has been widespread understanding that some communities have had more dire consequences with regard to COVID-19 than other communities [4][5][6]. Unfortunately, there are so many of these marginalized groups that there is no way to analyze them all in the scope of this study. This leads us to look for which group might be the most vulnerable, which brought us to Figure A2 [3]. We can see that deaths resulting from COVID-19 are heavily skewed towards the elderly and that those aged 65 and older account for 79.3% of the COVID-19 deaths in the USA [7], while only accounting for 16.5% of the population [8]. When compared to the death proportion overall (including COVID-19) of 73.9% for the 65+ group, there is reason to believe that the elderly are more adversely affected than other age groups.
By assuming that different age groups are affected with varying degrees of severity [6,9], it is our goal to ferret out how different these adversities are when applied to situations where we know the elderly are in more compromising positions; specifically in places, such as nursing homes and hospitals and other enclosed spaces where, especially, many elderly are cared for. First, we must quantify the spread of transmission in an enclosed space, for which,we will use the idea proposed by Gammaitoni and Nucci [10].

Gammaitoni and Nucci Model
Though we will be referencing the work performed by Gammaitoni and Nucci (G-N model), it is of great importance to briefly examine the governing equation that lead to the success of their model.

Wells-Riley Equation
It was the seminal work of Wells [11] and later, its extension by Riley [12], that led to the model by Gammaitoni and Nucci (G-N Model) [10]. Wells developed a method of estimating how much contagion is expelled into the air based on the size of the evaporated droplet and came up with the "quantum" theory for infections. Wells defined a "quantum" of infection as "the number of infectious droplet nuclei required to infect 1 − 1 e of susceptible persons" [10,11]. This is from the fact that there is a relationship between the dosage of infection and the initial response that reasonably follows a Poisson distribution. Later, Riley et al. [12] extended the concept and produced the Wells-Riley equation, namely where we have that C is the number of new cases, S is the number of susceptible, I is the number of infected, A is the ventilation rate in air changes per hour (ACH), V is the volume of the occupied space measured in meters (m 3 ), p is the pulmonary ventilation rate of the susceptibles m 3 h , and finally, q is the amount of quanta produced per hour by an infector.

G-N Extension
The challenge then became to discover how the amount of quanta in the air relates to the infection rate of individuals in an enclosed area and how we can best mitigate the risks of being in a closed area, if possible at all. Gammaitoni and Nucci came up with a clever set of differential equations that are built completely from the Wells-Riley equation but would relate quanta to infection rate. The equations are where we have Q as the total amount of quanta in the space. Using these equations, Gammaitoni and Nucci would test preventable countermeasures against outbreaks occurring in areas assumed to be previously infection-free, such as operating rooms and sterile hospital rooms. Since our purpose lies in resolving enclosed situations such as this, we can use this as a base to understand how COVID-19 might spread amongst a susceptible population. However, the G-N model is limited in scope to understanding only how quanta and infections relate, and one cannot truly get a complete picture of the dynamics of an outbreak using solely this model.

SEIR Model
To overcome the limitations of the G-N model, we can look to using the well-known SEIR model that has been a mainstay for decades in studying infectious disease dynamics. The SIR model, also known as the Susceptible-Infected-Removed model, is the basis for the SEIR model. In the SIR model, there are three linked categories, the number of susceptible, S, the number of infected, I, and the number of people removed through death or immunity, R. The interactions between these categories are described by the differential equations following [13,14] dR dt = γI (5) and N = S + I + R. Here, N is the total population, β is the infection transmission rate between susceptible and infected people, and γ is the recovery rate which is the reciprocal of average days required to become removed [15]. We can then normalize the variables as: the basic reproduction number, is defined to be the expected number of infections that a singular infector will produce amongst a totally susceptible population. In the literature, authors will often refer to these as generations of infection, where the first infector will be generation zero, or generation one, depending on the author. Here we will refer to the first generation as zero.
Substituting back into our original differential equation gives us the normalized versions and s + i + r = 1.

Incubation Period
The basic SIR model lacks realism, in that it does not take into account the transition time between becoming infected and having the ability to infect others. We can solve this problem by adding an "(E)xposed" state that is a link between the susceptible and infected states, the incubation period here being the time it takes the average individual to start showing symptoms. We can then extend the SIR model with this consideration by adding in an "exposed" state before they become an infector. This gives us the SEIR model where now S + E + I + R = N. Normalizing yields s = S N , e = E N , i = I N , and r = R N to give us with s + e + i + r = 1. The newly minted parameter α is the rate of progression from the exposed to the infected state. This is the reciprocal of the number of days it takes on average to develop symptoms. The basic reproduction number R 0 does not change with the addition of an exposed state, as none of the infectors nor the infection is lost, only translated further out in time.

Combining Gammaitoni-Nucci and SEIR
As stated previously, to overcome the weaknesses in each of the SEIR and G-N models we will combine them to gather information on population dynamics in an enclosed space. To do this we notice that Equation (1) from the G-N model and Equation (3) from the SIR model are similar, and under the assumption that ventilation and quanta generation are constant. Because the time scale differences between susceptibles and quanta levels from equations of the G-N model work on such different time scales, as quanta generation is performed on small time scales and sickness on a much larger time scale, for practical purposes, we can assume that quanta levels are in a pseudo-steady-state. Thus, we will allow dQ dt = 0, we can find Q = qI A and can express Equation (1) as We can then combine and normalize, as performed previously, to obtain the first combined model This model with its transition rates can be visualized in Figure 1.

Multi-Age SIR Model
One can expand the use of the SIR model to compare sickness rates of different groups. In [16], such an expansion was performed to handle different age groups with different infectious classes (symptomatic and asymptomatic, denoted by I s i and I a i , respectively). The expanded model is given by where It is the assumption that the recovery rates are the same for both infectious classes. β is the transmission rate on contact, η is the transmission modifier for asymptomatics, which is the rate at which asymptomatic infectors will spread the disease as a proportion of symptomatic spread rate. There are many instances, where due to lack of symptoms, a contagion is less likely to transmit from an asymptomatic person. Finally, k is the proportion of infectors that become asymptomatic [16].

General Contact Matrix
In Equations (21)-(23), we also find a C ij term, which is the contact matrix for the system. The contact matrices are the average number of contacts per day by someone in category i interacting with someone in category j, where we have that the person who is in class j being designated as asymptomatic or symptomatic [16]. In this model, since there are two different classes of contact matrices, one for symptomatic and one for asymptomatic, then they must be equated in some way. To do this we chose to let w be the proportion of contacts made by a symptomatic infected person compared to an asymptomatic person. We assume that if someone is sick with symptoms that they will try and reduce their contacts, thus we will say that C s ij = wC a ij , where 0 ≤ w ≤ 1. Thus, Equations (21)-(23) will change such that and then we will just let C a ij = C ij to finally obtain

Combining G-N and Age-Structured SEIR
Now that we have the combined Gammaitoni-Nucci and SEIR models, we can combine this with the multi-age model to better see how each age group gets affected during a disease outbreak. We first notice that in the multi-age model Equations (21)-(23), we have the term which is the sum of all the different age groups contacting each other and spreading the disease. This is effectively the combined number of infectors, or the I in the original SIR model. If we use this idea, then we can say that this is the number of people that have transitioned from the susceptible to the infected stage. We can then interject an intermediate stage, as is done in the SEIR model, which more realistically approximates the situation that we model as with any droplet diseases that take time before symptoms show. Thus, if we say that first, the susceptibles have to go through the exposed phase E, then we can combine these two models to obtain where k is the proportion of those that become asymptomatic infectors, α is the exposure time or the reciprocal of the time it takes on average to develop symptoms, γ is the reciprocal of the average recovery time in days, and i = 1, 2, 3 . . . are the age groups. Since we are under the assumption that the rate at which quanta is generated is matched by the pseudo-steady-state ventilation, then we can also assume that β can be substituted for pq VA , as was done in the SIR and G-N combination. Thus, we have the overall combined multi-age ventilated space SEIR model. Normalizing then gives us which we will use in our later computations. The flowchart for the final model with rates of transition is given in Figure 2.

Next Generation Matrix
Oftentimes the most important question to ask is how an infection can propagate amongst the populace. This information paves the way for researchers to find ways to combat the spread itself, and is summed up in the basic reproduction number R 0 that we briefly touched on earlier. The R 0 from the normalized SIR model is a dimensionless value found when using values of di dt at time t = 0. Thus, when we have Nβsi − γi > 0 an epidemic situation occurs, as more people are becoming infected, and when Nβsi − γi < 0 an epidemic is stemmed, as infections are being removed. We can reduce this equation to Nβ γ = R 0 as we have assumed at t = 0 the entire population is susceptible and hence s = 1. Thus, R 0 < 1 means an epidemic is stemmed, and R 0 > 1 means we have an epidemic situation. For our model, we have more than one age group and thus discovering R 0 is made trickier but is still just an extension of what was performed above. The main difference is that there are now several infected classes, and they may have different rates of contact between them, as well as infections. Our goal should then be to average the expected number of new infections over all of the infected states. The end result of these calculations will become a square matrix called the Next Generation Matrix (NGM). In an NGM H, the entry h ij is equal to the R 0 between the completely susceptible group i, and the infected person in group j. Creating H is not extremely hard, and there are recipes to construct it [17]. However, the key is to break them down into smaller parts so that they can then be combined to create the full matrix.

Transmission and Transition Matrices
Extensive work has been performed in the field of creating NGMs. Specifically, Diekmann et al. [17] go into great depth about how to construct one of these matrices and how to use them for many different biological purposes. For our work we will let our NGM be H, as above, and we will construct H following the guidelines in the stated paper. From Diekmann's previous work, we know, by definition, that the dominant eigenvalue of H will be the basic reproduction number R 0 ; thus, it will be our goal to first construct H.
To create H we must first have a linearized system of ordinary differential equations, and thus we must make an assumption regarding Equation (31). If we assume that we start at the disease-free steady-state, where we have s i = m i where m i is the proportion of people in the group i, then we can simplify the equations down to the linear system which are going to be the basis of our transmission and transition matrices. To create the matrices, we will want the transmission matrix T to correspond to all the cases where an infected births a new infection. The transition matrix M will then correspond to all the state changes from the exposed state to one of the infected states or from the infected to the removed state. Now, we have three infected categories and four age groups, and we know that each age group will have all three infected categories; thus, for our purposes, we know that both T and M will be 12 × 12 matrices. Then, given the indices i and j to mean age groups i and j, T is constructed as T ij being the rate at which people in age group j infect people in age group i. Now for condensation purposes, let z 1 = −ηBN and z 2 = wBN. This gives rise to the matrix T (see Appendix B Transmission matrix). Similarly, we have the transition matrix M (see Appendix B Transition matrix) being constructed using the rate at which exposed people in group j progress to an infected group i. Now, we can fit these two matrices into the form given in [17] where X is the transpose of e i , i a i , i s i , a 12 × 1 vector. Using the software Matlab we then calculated H = −TM −1 , which is the NGM we were hoping to obtain. We can then reduce to a 4 × 4 matrix using the recipe from [17]. If we call this new matrix K, then (k − 1)T 1,5 + kT 1,9 (k − 1)T 1,6 + kT 1,10 (k − 1)T 1,7 + kT 1,11 (k − 1)T 1,8 + kT 1,12 (k − 1)T 2,5 + kT 2,9 (k − 1)T 2,6 + kT 2,10 (k − 1)T 2,7 + kT 2,11 (k − 1)T 2,8 + kT 2,12 (k − 1)T 3,5 + kT 3,9 (k − 1)T 3,6 + kT 3,10 (k − 1)T 3,7 + kT 3,11 (k − 1)T 3,8 + kT 3,12 (k − 1)T 4,5 + kT 4,9 (k − 1)T 4,6 + kT 4,10 (k − 1)T 4,7 + kT 4,11 (k − 1)T 4,8 + kT 4,12     then we find ρ(K) = R 0 , where the ρ(K) is the spectral radius of K [17]. However, determining the exact eigenvalues is a tedious endeavor and thus we will use the software Matlab to do the calculation for us.

Results
Since we are specifically discussing COVID-19, we will make appropriate assumptions on some of the parameters used in the model that will be static throughout all of the trials. As [18] estimated, we will be using a pulmonary ventilation rate of 0.48 m 3 /h. We will also be assuming that people in the exposed category have no risk of spreading the disease, and the average person will have an incubation period of 5 days, thus α = 1/5 [19]. As various sources report a recovery time of one to two weeks [20][21][22], we will then assume that the average infectious period is 10 days, and hence the recovery rate is γ = 1 10 . We have from [23] that at least 50% of COVID-19 spread was due to asymptomatic individuals, hence k = 0.5.
We will also be assuming that the air will be completely mixed immediately upon having quanta enter it, and that the population will stay static throughout the entire trial. There will also be the quantity A, the number of air changes per hour. This is defined as the rate at which particles are removed from the space and is expressed in air changes per hour (ACH), which is the ratio of the volume of air being pumped into the room over an hour and the volume of the room. As per [10] we will be using A = 6 for standard ventilation, A = 18 for the addition of a HEPA filter on top of general ventilation, and A = 45 for the addition of a UVGI filter on top of general ventilation. We will assume that all of these systems are properly installed and that they will continuously provide the same amount of ACH. In this manuscript, our goal is to better understand how ventilation affects a population that is constantly subjected to it, and thus we will be keeping a static population for the trials below.

Creating C ij for Proposed Model
Our model heavily relies on the data garnered by Mossong et al. [24]; the key to our work was that the diaries kept a record of ages. The data was split into 15 categories ranging from 0 to 4 years of age to 70+ years of age for each country. We then averaged each of the categories contacts over all of the countries which can be seen in Table A1 for the 15 age groups as we are trying to eliminate as many biases as possible coming from the culture and other factors. We then paired the number of age categories from 15 to 4 [ Table A2], consisting of ages 0-19, 20-39, 40-59, and 60+, which we will call D ij . Finally, since the data itself is not closed, we apply a reciprocity condition to ensure that there are equally as many interactions between i and j as there are between j and i. This is given below This four-category table is what we use throughout the findings as our contact matrix C ij . In our model, it is not required to have only four age groups, as with the right data set to create a contact matrix, any number of age groups can be made to fit the data on hand. However, for the sake of simplicity, we have chosen to only survey the four age group range.

Parameter Sensitivity Analysis
As the main goal of our research is to better understand how to mitigate viral spread based on the proposed model, we first want to see the effect filtration has on the basic reproduction number for differing quanta production rates q. In [18], the estimated values for q were 15-128 per hour for influenza, depending on how q was calculated, and 1-10 per hour for rhinovirus. Owing to the fact that data for quanta production is scarce and mostly related to only a few papers [10,15,18] that have discussed their values. We have chosen the ancestral strain to have q = 10 for our model, and based on [25], we can approximate the delta variant as having q = 25 when we are considering the best-case scenario. We provide a contour and table of important values below.
From Figure 3, we can see that filtration has a strong effect on the R0; however, only with a UVGI filter do we have a chance at containing an outbreak, and even then, only for roughly q = 20. Through this, we can see that unless the production amount of quanta is low, there will need to be other strong measures taken to stop the spread of COVID-19 in an enclosed space, and that general filtration corresponding to A = 6 is almost never going to be enough to contain the spread. These findings would suggest that most, if not all, enclosed spaces should have some kind of ventilation system to neutralize some of the infectious material that is being expelled from infected individuals.
Next, we want to inspect what happens as the parameters are shifted for w the symptomatic contact rate and η the asymptomatic infection rate. Figure 4 shows us that both infection rates for asymptomatics and the contact percentage for symptomatics affect the system in roughly the same way, and thus provides evidence to show how much of an effect stopping the spread of symptoms are, and that reducing the number of contacts that an infected individual has with susceptible individuals can play a large role in containing an outbreak.
Finally, we wanted to test how sensitive our matrix is to changes in the number of people per group against R 0 and infectivity rate of the eldest group, which is done below. For the data below, we paired down the four age groups into three age groups, so that we can obtain a better understanding of how having extreme proportions in the age groups will affect our overall model.
We achieved the results in Figures 5 and 6 by allowing two different age group proportions to vary; we have that m 1 is the proportion of the youngest age group, and m 3 is the proportion of our oldest age group. We then analyze the overall sickness rate of the eldest age group by looking at the total number of infected in that group ( Figure 5). This allows us to better understand how one might mitigate interactions between age groups to help contain the spread of COVID-19. We can see in (Figure 6) that the change in R 0 due to having differing amounts of age groups is negligible. However, we can see that our model does have sensitivity towards contacts, as evidenced by Figure 5. We can see that the proportion of the eldest group that gets sick increases almost linearly with the change in having more younger people. As the eldest group had the least contacts, this leads us to determine that contacts play more than a negligible role in determining overall sickness rates, though still overall, it is a minor change in terms of overall population sickness. This means that, according to our model, one of the best ways to combat the eldest age group sickness is by limiting their contact with younger groups of people that are more likely to interact with more people each day.

Stability Analysis
As our system is closed and has no births, we do not have an endemic point; thus the only place to look for stability is around the infection-free steady-state. As we are using an expanded SIER model with the NGM recipe, Diekmann et al. [17] inform us that R 0 controls the stability of the disease-free steady-state.

Elderly Care Facility
Our example will take place in an elderly care facility. Suppose there are 109 residents of the facility, in which if we say that the average room size is 20 square meters with 3 m ceilings, we then have an average of 60 m 3 per room. Then multiply this by 109 residents, and we obtain 6540 m 3 . If we allow an extra 3460 m 3 for things such as dining rooms, kitchens, and any other miscellaneous rooms, then we estimate V = 10,000 m 3 of space in the facility. We will also assume that there will be roughly two out of three people in the facility will be elderly residents, and the other one-third will be in-care attendants, family, and other visiting guests. Thus, we will say there are 66 other people in the facility that are not residents, which will give us a total population of N = 175. From this, we will assume that all of the residents that reside there are in the 60+ age category. A fairly arbitrary age breakdown of the facility (and what we will be using) is: 4% aged 0-19, 20% aged 20-39, 13% aged 40-59, and 63% aged 60+. All graphs below will pertain to these age ranges. We justify these age breakdowns as a scenario before lockdown, and thus there will be those of all age groups within the building.

Initial Infector and Contact Rate
We have arbitrarily chosen one initial asymptomatic infector in the 20-39 age range to find the results below, though this could easily be changed for different data trials. We will also arbitrarily assume that the asymptomatic contact rate is four times the symptomatic contact rate, hence w = 0.25. We will also be using, from [23], those who never develop symptoms still spread the disease at 75% of the rate symptomatic individuals would, thus η = 0.75 for the proportion of infected that become symptomatic and the infection rate of those asymptomatic, respectively. Below we will see the overall infection rate amongst all of the age groups for the idealized delta variant and both HEPA and UVGI filtration systems.
We can see that, in general according to our model, the elderly are the least impacted proportionally to the other groups. This is most likely due to the lower contact rate that the elderly have in general with each other and other age groups. We believe that based on [25], if we allow q = 10 to be the best-case scenario for the ancestral strain of COVID-19 and delta has a little over twice the R 0 that the ancestral strain does, then q = 25 should affect R 0 in a similar way. Viewed in this light, Figure 7 shows us that HEPA filtration would not stop between roughly 60% and 100% of susceptibles will become infected, and from Figure 8 we see that even with UVGI filtration, an outbreak will still occur with the delta variant.

Conclusions
Our mathematical model indicates what we saw in the news headlines in real life, in that when applied to a nursing home-like setting, the outcome, even with the best filtration systems, is unlikely to be enough to prevent the spread of COVID-19, whether it be the ancestral strain or the delta strain. We can see that the youngest amongst the groups was always the quickest to get the infection, and we believe that is due to the Mossong data showing that younger people contact others more often throughout their day. We also found that according to our model, the amount of quanta has a stronger impact on the overall health of the population compared to adding filtration systems to airflow, and that without, what we consider an idealized scenario (ancestral strain have q = 10), there are still many complications that can arise in a small enclosed setting, as can be seen in Figure 3. Our model indicates that the infection rate for asymptomatic infectors and the rate at which symptomatic infectors restrict their contacts are roughly equal in terms of their impact on the overall sickness rate, as seen in Figure 4. Finally, we found that our model is sensitive to the amount of contacts an age group has. By shifting the proportions of the youngest and eldest groups, we found that we can alter the overall number of infected by several percentage points, as seen in Figure 5.

Further Research
Pertinent further research would be to look into how a specialized model could be used to determine whether being closer or further from ventilation would help in any meaningful way. We also believe it would be useful to examine what would happen if we were to add vaccine information, as well as other uses of personal protective equipment (PPE), such as masks. In terms of real-world differences, we believe that because this is the first stage of this model, there are several dynamics that could be added to increase the robustness of the model overall. We realize that it is not comprehensive to have the population static; however, for simplicity, we decided to keep our population constant to better understand the dynamics at play in terms of only those that are inside of a ventilated space. The next step of this research will be to allow for the mixing of the ventilated population with the non-ventilated population to see how outside sources impact the sickness rate rather than just the residents of a ventilated zone.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this paper: