SIM-D: An Agent-Based Simulator for Modeling Contagion in Population

Abstract: The spread of infectious diseases such as COVID-19, influenza, malaria, dengue, mumps, and rubella in a population is a major threat to public health. Many of these diseases spread from one person to another through close contact. Without proper planning, an infectious disease can become an epidemic and result in large human and financial losses. To respond better to the spread of an infectious disease and take measures for its control, public health authorities need models and simulations to study how such diseases spread. In this paper, an agent-based simulation engine is presented that models the spread of infectious diseases in a population. The simulation takes as input the human-to-human interactions, population dynamics, disease transmissibility and disease states, and shows the spread of disease over time. The simulation engine supports non-pharmaceutical interventions and shows their impact on the disease spread across locations. A distinguishing feature of this tool is that it is generic; it can therefore simulate a wide variety of infectious disease models, including susceptible-infectious-recovered (SIR), susceptible-infectious-susceptible (SIS) and susceptible-infectious (SI). The proposed simulation engine will help policy-makers and public health authorities study the behavior of disease spread, thus allowing for better planning.


Introduction
Infectious diseases such as COVID-19, influenza, malaria, dengue, measles, mumps, and Ebola are a major threat to public health safety and security [1,2]. Every year, new strains of epidemic diseases circulate around the globe and infect millions of people, resulting in thousands (and sometimes millions) of deaths. According to the 2019 report of the World Health Organization (WHO), health systems across the world are weak and are not prepared for epidemic-like situations [3]. This means that human lives across the globe are at increased risk. This observation was borne out by the recent wave of COVID-19, a pneumonia-like disease caused by a coronavirus. As of this writing, the disease has infected 40 million people and killed over one million [4]. Every place and community was affected. The disease spread to over 200 countries in a matter of two to four months [5].
The world is at a greater risk of epidemic spread, for several reasons. First, the world population is growing rapidly, which means that the number of people that can be infected is increasing. Second, many people are moving to cities for better life opportunities, thus increasing the population

Disease Model Parameters
In this section, the modeling parameters of a networked susceptible-exposed-infectious-recovered (SEIR) disease model are discussed. This model can be used to describe SIR, SIS and SI disease models. The disease states in the SEIR model describe the progression of the disease within individuals and its transmission in the population [13], as shown in Figure 1.
The three most important parameters are transmissibility, infectious period and incubation period. The transmissibility (ρ) controls the diffusion intensity of the disease in the population. It is the probability of transmission of the disease from an infectious individual to a susceptible individual in one minute of contact [14].
The incubation period (∆t_E), also known as the latent period, is the interval during which an infected individual cannot transmit the disease to susceptible individuals. The infectious period (∆t_I) is the period during which an infected individual can transmit the disease to susceptible individuals. Table 1 lists the dwell times in different disease states for typical influenza [15,16].
Figure 1. The four states of a basic Susceptible-Exposed-Infectious-Recovered (SEIR) disease model. ∆t_E, ∆t_I, and ∆t_R show the dwell times in the exposed, infectious, and recovered states, respectively. The model can be used to describe susceptible-infectious-recovered (SIR), susceptible-infectious-susceptible (SIS) and susceptible-infectious (SI) disease models.
Based on the dwell times in the exposed, infectious and recovered states, diseases can be modelled using the SIR, SIS and SI disease models, as shown in Table 2. In SIR (susceptible→exposed→infectious→recovered), a person stays in the recovered state forever after recovering. The individual develops immunity to that strain of the disease and will not become susceptible again [17]. Typical influenza is usually modelled using SIR [13]. In SIS (susceptible→exposed→infectious→susceptible), a person becomes susceptible to the disease again immediately after recovering. Measles, a well-known infectious disease, and most sexually transmitted diseases conform to SIS [18,19]. In SI (susceptible→exposed→infectious), a person remains infectious forever after becoming infectious. The human immunodeficiency virus (HIV/AIDS), usually transmitted through bodily fluids, is modelled using SI [20].
The progression of the disease within an agent is probabilistic and can be represented as a finite state machine (FSM), as shown in Figure 2. The transition probability depends on whether the person is vaccinated or not. A person vaccinated against the disease is less likely to be infected than someone who is not vaccinated. The FSM in Figure 2 shows the SIR model. For SI, the infectious period is set to infinite. For SIS, a transition is added from the recovered state to the uninfected (susceptible) state [21].
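The way a single SEIR state machine covers SIR, SIS and SI can be sketched in C++ as follows. This is a minimal illustration and not the SIM-D source: the type and function names (`DiseaseModel`, `step`, `kForever`) are hypothetical, dwell times are assumed to be whole simulation days, and the probabilistic infection transition (susceptible to exposed) is left to the interaction step.

```cpp
#include <cassert>
#include <limits>

// The four SEIR disease states described above.
enum class State { Susceptible, Exposed, Infectious, Recovered };

// Sentinel dwell time meaning "never leaves this state" (a convention of this sketch).
constexpr int kForever = std::numeric_limits<int>::max();

// Dwell times in simulation days. All names here are hypothetical.
struct DiseaseModel {
    int exposedDays;     // incubation period
    int infectiousDays;  // infectious period
    int recoveredDays;   // days spent recovered before becoming susceptible again
};

// Advances one agent by one simulation day and returns its new state.
// daysInState counts the days already spent in the current state; the caller
// sets it to 0 when an interaction moves the agent from Susceptible to Exposed.
inline State step(const DiseaseModel& m, State s, int& daysInState) {
    assert(daysInState >= 0);
    if (s == State::Susceptible) return s;  // infection comes from contact, not time
    ++daysInState;
    if (s == State::Exposed && daysInState >= m.exposedDays) {
        daysInState = 0;
        return State::Infectious;
    }
    if (s == State::Infectious && m.infectiousDays != kForever &&
        daysInState >= m.infectiousDays) {
        daysInState = 0;
        // A zero recovered period sends the agent straight back to Susceptible (SIS).
        return m.recoveredDays == 0 ? State::Susceptible : State::Recovered;
    }
    if (s == State::Recovered && m.recoveredDays != kForever &&
        daysInState >= m.recoveredDays) {
        daysInState = 0;
        return State::Susceptible;
    }
    return s;
}
```

Under these assumptions, `DiseaseModel{1, 7, kForever}` behaves as SIR, `DiseaseModel{1, 7, 0}` as SIS with immediate loss of immunity, and `DiseaseModel{1, kForever, kForever}` as SI. The numeric dwell times are illustrative only.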

Related Work
The modelling of infectious diseases is important for gaining insights into their spread [21]. Two popular approaches have been used for modelling the spread of infectious diseases: deterministic models and agent-based stochastic models [22,23]. A brief overview of each is given below.

Deterministic Models or Compartment Models
In deterministic or compartmental mathematical models, the population is divided into groups based on the state of the disease (i.e., susceptible, infected, and recovered people). The model is then applied to these groups to determine the final count of infections and the duration of the epidemic [24][25][26]. The model assumes uniform contacts within the population. The model is not able to give the time-varying information about the infections. EpiModel is a package for mathematical modeling of infectious disease over social networks. It models SI, SIR, and SIS epidemics, with and without demography [27]. It supports deterministic compartmental models, stochastic individual contact models, and stochastic network models. A study conducted by Ko Kwok and his team analyzed deterministic models for modeling infectious disease in the early stage of a disease outbreak [28]. The study concludes that mathematical models can help inform policy makers by evaluating the effectiveness of different existing intervention approaches in the early phase of epidemics.
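To make the compartmental approach concrete, the following C++ sketch integrates the classic SIR equations with uniform mixing and reports the two quantities mentioned above: the final count of infections and the duration of the epidemic. It is an illustration only and is not taken from any of the cited tools. The function name, the contact rate `beta`, the recovery rate `gamma`, the forward-Euler scheme and the "fewer than one infectious person" stopping rule are all assumptions of this sketch.

```cpp
#include <cassert>

struct SirOutcome {
    double totalInfected;  // initial cases plus everyone who left the susceptible group
    int durationDays;      // days until fewer than one person is infectious
};

// Forward-Euler integration of the classic SIR equations with uniform mixing:
//   dS/dt = -beta * S * I / N,   dI/dt = beta * S * I / N - gamma * I
// beta (contacts per day) and gamma (recoveries per day) are hypothetical inputs.
inline SirOutcome runCompartmentalSir(double population, double initialInfected,
                                      double beta, double gamma,
                                      int maxDays = 3650, int stepsPerDay = 10) {
    assert(population > 0.0 && initialInfected >= 0.0 && initialInfected <= population);
    assert(beta >= 0.0 && gamma >= 0.0 && maxDays >= 0 && stepsPerDay > 0);
    double s = population - initialInfected;  // susceptible compartment
    double i = initialInfected;               // infectious compartment
    const double dt = 1.0 / stepsPerDay;
    int day = 0;
    while (day < maxDays && i >= 1.0) {
        for (int k = 0; k < stepsPerDay; ++k) {
            const double newInfections = beta * s * i / population * dt;
            const double newRecoveries = gamma * i * dt;
            s -= newInfections;
            i += newInfections - newRecoveries;
        }
        ++day;
    }
    return {population - s, day};
}
```

A call such as `runCompartmentalSir(100000, 500, 0.4, 0.1)` treats the whole population as one well-mixed group, which is exactly the uniform-contact assumption that agent-based models relax.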

Agent-Based Stochastic Models
Stochastic models are probabilistic models in which the transmission of the disease depends on the probability of transmission as well as on the other agents (and their demographics). In an agent-based modeling approach, infection occurs only through contact with infected individuals within the society [12,29]. These models give time-varying details about the spread of diseases, such as the number of infected people each day and their locations. This is a realistic approach for observing the spread of diseases and understanding their patterns [30].
Some of the most popular agent-based epidemic simulation models include EpiFast [31], EpiSims [13] and EpiSimdemics [11,32,33]. The models were good at modelling the epidemic in shared and distributed environments. AceMod, an agent-based modelling framework for studying influenza epidemics, was able to analyse the spatiotemporal spread of contagion and influenza spatial synchrony across the population [34]. Longini developed a similar simulation engine and its parallel implementation, but the agents in social networks were not real but surrogates [35].
Several other models have been developed for epidemic modelling. GLEAMviz is a desktop application that provides a simple, intuitive, and visual way to set up simulations, develop disease models and evaluate simulation results using a variety of maps, charts, and data analysis tools [36]. The dengueMe simulation engine can simulate three cases. The first, proposed by Nishiura, is a simple dengue transmission model based on ordinary differential equations [37]. The second is an Aedes aegypti population dynamics model, also based on ordinary differential equations, proposed by Lana et al. [38]; it includes simulating the application of insecticide in some areas of a real urban space (geographic database). The third is an agent-based transmission model based on Medeiros et al. (2011) [39] and simulated in the same real urban space.
The Spatiotemporal Epidemiological Modeler (STEM) is free software available through the Eclipse Foundation [40]. Originally developed for research, the tool is designed to help scientists and public health officials to create and use spatial and temporal models of emerging infectious diseases. These models can aid in understanding and potentially preventing the spread of diseases. EpiFire [41][42][43] is an application programming interface and graphical user interface implemented in C++, which includes a fast and efficient library for generating, analyzing, and manipulating networks. Network-based percolation and chain-binomial simulations of susceptible-infected-recovered disease transmission, as well as traditional non-network simulations, can be performed using EpiFire.

The SIM-D Algorithm
The SIM-D algorithm is based on information diffusion across a social interaction network. It performs interactions between persons (also known as agents) according to the interaction graph shown in Figure 3. A brief description of the SIM-D algorithm is given below and summarized in Algorithm 1.

Algorithm 1: The general SIM-D algorithm, where P represents a person, I_ij refers to an interaction between persons i and j, and PT_ij refers to the outcome of their interaction. N is the total number of simulation time-steps and K is the total number of persons.

Input : Interaction network of agents, disease parameters (transmissibility, state duration)
Output : Infections at locations over time-steps, variation of people in disease states

initialization();  // load interaction graph and disease state
for (time-step = 1 to N) do
    for (i = 1 to K) do  // for each person
        prepareInteractions();  // prepare list of interactions to perform
        foreach (I_ij ∈ P_i) do  // for each interaction of two persons (i and j)
            if (i < j) then
                ComputeInteractionOutcome(I_ij);  // determine probability of transmission
            end
            if (PT_ij > 0.5) then
                PM_ij → sendOutcomes();
            end
        end
        P_i ← receiveOutcomes();  // receive outcomes for P_i
    end
    UpdateState();  // update person status and infections at locations
end

The probability of transmission for an interaction, PT_ij, is computed using Equation (1), where r is transmissibility, s is susceptibility, and t is the duration of contact. If the computed probability is greater than 0.5, the interaction results in the transmission of disease; otherwise, it does not.
3. If a person gets infected, that person is notified of the infection.
4. At the end of the time-step, the persons with outcomes update their state. The number of infections at locations is also updated.
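The structure of one SIM-D time-step (compute outcomes for each interaction, apply the 0.5 threshold, then update states only at the end of the step) can be sketched in C++ as below. This is a simplified, hedged illustration and not the authors' code. In particular, `transmissionProbability` is a stand-in assumption: it uses the form 1 - (1 - r·s)^t, which is common in contact-based simulators, and it should not be read as the paper's Equation (1). All type and function names are hypothetical, and the exposed and recovered states are omitted for brevity.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

enum class Health { Susceptible, Infectious };

struct Person {
    Health health = Health::Susceptible;
    double susceptibility = 1.0;    // s
    bool pendingInfection = false;  // outcome received during the current step
};

// One edge of the interaction graph: persons i and j met for `minutes` (t).
struct Interaction {
    std::size_t i;
    std::size_t j;
    double minutes;
};

// ASSUMPTION: stand-in for the paper's Equation (1), which is not reproduced here.
inline double transmissionProbability(double r, double s, double t) {
    assert(r >= 0.0 && r <= 1.0 && s >= 0.0 && s <= 1.0 && t >= 0.0);
    return 1.0 - std::pow(1.0 - r * s, t);
}

// Runs one time-step and returns the number of new infections.
inline int runTimeStep(std::vector<Person>& people,
                       const std::vector<Interaction>& interactions, double r) {
    for (const Interaction& x : interactions) {
        assert(x.i < x.j && x.j < people.size());  // each pair is stored once, i < j
        Person& a = people[x.i];
        Person& b = people[x.j];
        if (a.health == b.health) continue;  // needs one infectious, one susceptible
        Person& target = (a.health == Health::Susceptible) ? a : b;
        if (transmissionProbability(r, target.susceptibility, x.minutes) > 0.5) {
            target.pendingInfection = true;  // "send outcome" to the infected person
        }
    }
    int newInfections = 0;
    for (Person& p : people) {  // state changes are applied only at the end of the step
        if (p.pendingInfection) {
            p.health = Health::Infectious;
            p.pendingInfection = false;
            ++newInfections;
        }
    }
    return newInfections;
}
```

Deferring the state update to the end of the step means that a person infected during a time-step cannot infect others within that same step. With the assumed formula, r = 0.036 and s = 1, the probability crosses the 0.5 threshold at about 19 minutes of contact.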

Implementation
The SIM-D algorithm was implemented in C++ using object-oriented programming (OOP). Persons were represented as C++ objects that interact with each other through messages (function calls). A flowchart of SIM-D is shown in Figure 4. The flowchart is adapted according to the selected disease.
At the start of the simulation, the manager object first calls the input reader to initialize the Person, Location and Disease State objects. The person-person interaction graph is loaded into the Person objects and the disease parameters are loaded into the Disease State object. The manager then starts the simulation time-step with the processing of the Person objects. As shown in Figure 5, each iteration has five basic steps.

• Prepare Interactions: Each person object computes the interactions that it has to perform with other persons. The interactions are computed based on the input interaction graph, the person's status (infected or susceptible), vaccination status and the isolation criteria.
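The Prepare Interactions step can be pictured with the following C++ sketch. It is an assumption-laden illustration rather than the SIM-D implementation: the types (`Contact`, `PersonView`, `PlannedInteraction`), the rule that only susceptible or infectious persons plan interactions, the household flag, and the use of a random draw per outside contact are all hypothetical. The factor 1/9 for vaccinated persons follows the vaccination setting reported in the evaluation.

```cpp
#include <cassert>
#include <random>
#include <vector>

enum class Status { Susceptible, Exposed, Infectious, Recovered };

// One edge of the input interaction graph, seen from one person.
struct Contact {
    int otherId;     // neighbour in the interaction graph
    double minutes;  // contact duration
    bool atHome;     // true for household contacts (hypothetical flag)
};

struct PersonView {
    Status status;
    bool vaccinated;
    std::vector<Contact> contacts;
};

struct PlannedInteraction {
    int otherId;
    double minutes;
    double susceptibility;  // 1.0, or 1/9 when the person is vaccinated
};

// Builds the list of interactions a person will perform in this time-step.
// outsideFraction is the share of outside-home contacts permitted by isolation.
inline std::vector<PlannedInteraction> prepareInteractions(
        const PersonView& p, double outsideFraction, std::mt19937& rng) {
    assert(outsideFraction >= 0.0 && outsideFraction <= 1.0);
    std::vector<PlannedInteraction> planned;
    // ASSUMPTION: only susceptible or infectious persons can take part in transmission.
    if (p.status != Status::Susceptible && p.status != Status::Infectious) {
        return planned;
    }
    const double susceptibility = p.vaccinated ? 1.0 / 9.0 : 1.0;
    std::bernoulli_distribution goesOutside(outsideFraction);
    for (const Contact& c : p.contacts) {
        // Household contacts always happen; outside contacts only when permitted.
        if (c.atHome || goesOutside(rng)) {
            planned.push_back({c.otherId, c.minutes, susceptibility});
        }
    }
    return planned;
}
```

Passing the random generator in as a parameter keeps runs reproducible for a fixed seed, which is convenient when comparing intervention scenarios.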

Evaluation
In this section, the ability of SIM-D to simulate the spread of disease in a population is demonstrated. The aim is to show the spread of disease under normal conditions and the effect of different interventions. For each study, SIM-D was evaluated on the daily infections and the total number of infections that it reports. The variation of persons (agents) across states (susceptible, exposed, infectious and recovered) is also shown over the course of the simulation.

Experimental Setup
The experiments were performed on a quad-core Core i7 machine with 16 GB of memory. The simulation was run over population networks of the Peshawar region of Pakistan for 90 time-steps (also called simulation days). The simulation was run for the SIR and SIS models with different transmissibility values. The transmissibility, infectious period duration, and incubation period duration were chosen to mimic the spread of influenza [15].

SIR Infection Diffusion in Population
To show the spread of a disease that follows the SIR model, a simulation was run over a population network of 100,000 agents. The disease transmissibility was set at 0.036. At the start of the simulation, 500 people were randomly marked as infected. The simulation was run without additional interventions and had an attack rate (the share of the population infected during the simulation) of about 70%. Figure 6 shows that as the simulation progresses, more people get infected, reducing the number of susceptible persons. The epidemic peaks around the 40th iteration and then starts falling as more people enter the recovered state. The reasons for the high number of total infections are the high transmissibility of the disease and the long time spent in the infectious state (set at 11 simulation days). It would be interesting to compare the numbers from SIM-D against the actual spread of infections. Unfortunately, daily statistics about disease spread during the season (Nov-Feb) are not available for Peshawar. A rough comparison can be made against the number of influenza cases reported in Khyber Pakhtunkhwa (KP) [44,45], as Peshawar is part of KP and therefore has similar demographics. SIM-D reported comparable numbers (70,000 simulated infections against 55,000 reported cases). One major reason for the difference is that many cases are never reported and hence are not recorded.

SIS Infection Diffusion in Population
To show the ability of SIM-D to model SIS diseases, we simulated influenza over a population network of 100,000 agents. The disease transmissibility was set at 0.040. As is typical for influenza, the stay in the infectious state was set at 7 simulation days. The incubation duration was set at 1 simulation day, which is a typical latent period before a person infected with influenza starts infecting others. At the start of the simulation, 2000 people were randomly marked as infected. Figure 7 shows that as the simulation progresses, more people get infected, while the number of susceptible people falls sharply. The epidemic peaks around the 40th iteration. It is interesting to note that no more than 10% of the population was infected at any one time. However, over the course of the simulation, 80% of people got infected. The reason for the high number of total infections is the high transmissibility of the disease. Moreover, no intervention measures were applied, which leaves a major portion of the population available for infection.
Technically, SIS allows re-infections; however, no re-infections occur in this study. The reason is that, for influenza, there is only a small chance that a person is infected again within 90 days. To mimic this behavior, a recovery period of 90 days is applied, so that re-infection can happen only after the person becomes susceptible again.

Effects of Transmissibility
Transmissibility is the probability of transmission of disease in one minute of contact. In general, an increase in transmissibility results in faster transmission of disease from infectious to susceptible individuals. An experiment was performed using the SIS disease model to show this impact. In the experiment, the infectious period and incubation period were kept fixed at four simulation days and one simulation day, respectively, and the transmissibility was varied in the range 0.0001-0.002. The infectious period of four was chosen because, at this infectious period, the compute time is between maximum and minimum. An incubation period of one was chosen to keep the number of people in the incubation state to a minimum. This way, the majority of the population stays in the susceptible and infectious states, which shows the maximum effect of transmissibility. Figure 8 shows that as the transmissibility is increased from 0.0001 to 0.002, the total number of infections increases sharply. There are two reasons for this sharp increase. First, higher transmissibility has a higher impact on the spread of disease, as can be seen in Equation (1). Therefore, infectious diseases such as COVID-19 (described as "the first disease X", caused by a highly transmissible acute respiratory syndrome coronavirus) tend to spread faster than diseases with lower transmissibility such as HIV (for which an undetectable viral load means the infection is untransmittable).

Effects of Interventions
To study the progression of disease in the population, the government and health authorities are often interested in the impact of different interventions. SIM-D is capable of modelling non-pharmaceutical interventions as well. To see the impact of interventions on the disease progression, an experiment was designed in which the SIR model was simulated and two types of interventions were applied to it: isolation and vaccination. Under isolation, the activities of individuals are limited to their homes, and they are allowed to perform interactions outside the house only 10% of the time. Under vaccination, 50% of the population is randomly vaccinated. A vaccinated person is nine times less susceptible to the disease than an unvaccinated person. Figure 9 shows that when 50% of the population is vaccinated, the progression of the disease slows down significantly. The peak is also delayed by 10 days. A delay in the peak is important as it gives the health authorities more time to respond to the outbreak. Figure 9 also shows the impact of isolation or lockdown on the disease progression. When a lockdown is applied around the 23rd iteration, the number of infections starts falling. The disease picks up again slightly around the 50th iteration, when the lockdown is eased (40% of outside-the-house interactions are allowed).
Figure 9. Effect of interventions on the spread of disease. In the no-intervention case, the attack rate is 35%. In the isolation case, the attack rate is 18%. In the vaccination case, the attack rate is 13%.
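The two interventions in this experiment can be expressed as simple inputs to a simulation loop. The C++ sketch below is a hedged illustration, not the SIM-D configuration: the function names and signatures are hypothetical, and it assumes that the lockdown starts at iteration 23 with 10% of outside interactions allowed, is eased at iteration 50 to 40%, and that vaccination reduces susceptibility to 1/9 of the unvaccinated value.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Share of outside-home interactions allowed on a given simulation day.
// Default days follow the lockdown timing described in the text.
inline double outsideFraction(int day, int lockdownStart = 23, int lockdownEase = 50) {
    assert(lockdownStart <= lockdownEase);
    if (day < lockdownStart) return 1.0;  // no restrictions yet
    if (day < lockdownEase) return 0.10;  // lockdown: 10% of outside interactions
    return 0.40;                          // eased lockdown: 40% of outside interactions
}

// Randomly vaccinates a share of the population and returns each person's
// susceptibility: 1.0 if unvaccinated, 1/9 if vaccinated.
inline std::vector<double> assignSusceptibility(std::size_t population, double coverage,
                                                std::mt19937& rng) {
    assert(coverage >= 0.0 && coverage <= 1.0);
    std::vector<double> susceptibility(population, 1.0);
    const std::size_t vaccinated =
        static_cast<std::size_t>(coverage * static_cast<double>(population));
    // Mark the first `vaccinated` entries, then shuffle so the choice is random.
    std::fill(susceptibility.begin(),
              susceptibility.begin() + static_cast<std::ptrdiff_t>(vaccinated), 1.0 / 9.0);
    std::shuffle(susceptibility.begin(), susceptibility.end(), rng);
    return susceptibility;
}
```

For example, `assignSusceptibility(100000, 0.5, rng)` would correspond to the 50% vaccination scenario, while `outsideFraction(day)` would be queried once per time-step in the isolation scenario.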

Conclusions and Future Works
The recent wave of COVID-19 has shown that epidemic diseases can spread very quickly and incur huge human and financial losses. Healthcare systems across the globe were not prepared for its fast spread. The isolation or lockdown measures enforced by different countries were not very effective either, as they were not backed by proper simulation and modeling tools. It is therefore very important to have an epidemic modeling tool that gives insights into the spread of infectious diseases, both before an epidemic starts and while it is under way.
Our simulation tool SIM-D tries to fill this gap by simulating the spread of disease in a population. SIM-D is an agent-based model that computes interactions between agents and shows infections at locations over simulation time-steps. It models the realistic behavior of persons in a society: the transmission of an infectious disease can occur only if an infectious person meets a susceptible person.
SIM-D can help policy-makers and health authorities. It is generic and can simulate infectious diseases that follow the SIR, SIS and SI disease models. Additionally, it can show the effect of non-pharmaceutical interventions on the spread of disease.
As part of future work, we would like to extend the functionality of SIM-D to other contagions, such as the spread of fear, information, habits and rumors. We would also like to develop parallel algorithms for SIM-D so that it can simulate larger data-sets and more complex interventions. Support for mutations of diseases (viruses) would be a valuable addition, and we plan to add it as well. A post-simulation analysis of the statistics (i.e., age groups and localities affected) would give more insights and should be considered as future work. Furthermore, the current version does not support pharmaceutical interventions, which would be a good addition in future work. A graphical user interface (GUI) would make the software easier to use for end-users, helping them perform quick analyses without understanding the complexities of the system.
In the future, we would like to use more personal demographics, such as age, gender, social status, literacy and economic condition, to model each person more accurately. Equations that compute the individual and collective effects of these demographics will help further improve the SIM-D model.

Conflicts of Interest:
The authors declare that they have no conflict of interest.