EpiInfer: A Non-Markovian Method and System to Forecast Infection Rates in Epidemics

Kascelan, Jovan; Yang, Ruoxi; Shasha, Dennis

doi:10.3390/a18070450

Open AccessArticle

EpiInfer: A Non-Markovian Method and System to Forecast Infection Rates in Epidemics

by

Jovan Kascelan

¹

,

Ruoxi Yang

² and

Dennis Shasha

^3,*

¹

Department of Computer Science, New York University Abu Dhabi, Abu Dhabi P.O. Box 129188, United Arab Emirates

²

Department of Biostatistics, School of Global Public Health, New York University, New York, NY 10012, USA

³

Computer Science Department, Courant Institute of Mathematical Sciences, New York University, New York, NY 10012, USA

^*

Author to whom correspondence should be addressed.

Algorithms 2025, 18(7), 450; https://doi.org/10.3390/a18070450

Submission received: 14 June 2025 / Revised: 13 July 2025 / Accepted: 15 July 2025 / Published: 21 July 2025

(This article belongs to the Collection Feature Papers in Algorithms and Mathematical Models for Computer-Assisted Diagnostic Systems)

Download

Browse Figures

Versions Notes

Abstract

Consider an evolving epidemic in which each person is either (S) susceptible and healthy; (E) exposed, contagious but asymptomatic; (I) infected, symptomatic, and quarantined; or (R) recovered, healthy, and susceptible. The inference problem, given (i) who is showing symptoms (I) and who is not (S, E, R) and (ii) the distribution of meetings among people each day, is to predict the number of infected people (state I) in future days (e.g., 1 through 20 days out into the future) for the purpose of planning resources (e.g., needles, medicine, staffing) and policy responses (e.g., masking). Each prediction horizon has different uses. For example, staffing may require forecasts of only a few days, while logistics (i.e., which supplies to order) may require a two- or three-week horizon. Our algorithm and system EpiInfer is a non-Markovian approach to forecasting infection rates. It is non-Markovian because it looks at infection rates over the past several days in order to make predictions about the future. In addition, it makes use of the following information: (i) the distribution of the number of meetings per person and (ii) the transition probabilities between states and uses those estimates to forecast future infection rates. In both simulated and real data, EpiInfer performs better than the standard (in epidemiology) differential equation approaches as well as general-purpose neural network approaches. Compared to ARIMA, EpiInfer is better starting with 6-day forecasts, while ARIMA is better for shorter forecast horizons. In fact, our operational recommendation would be to use ARIMA (1,1,1) for short predictions (5 days or less) and then EpiInfer thereafter. Doing so would reduce relative Root Mean Squared Error (RMSE) over any state of the art method by up to a factor of 4. Predictions of this accuracy could be useful for people, supply, and policy planning.

Keywords:

epidemic modeling; compartmental models; SEIRS; forecasting; contact networks; parameter estimation

1. Introduction

Kermark et al. [1] introduced compartmental epidemiological models such as the Susceptible–Infected–Recovered model in which an agent is either Susceptible (healthy) or Infected (symptomatic and contagious) or Recovered (healthy and with partial immunity). By adding the Exposed compartment (asymptomatic and contagious), this model can be extended to the SEIRS model which characterizes most epidemics in which people can be contagious but not symptomatic. Understanding the evolution of an epidemic requires knowing the transition probabilities between these states, a process called calibration. The relationship between the states and parameters can be seen in Figure 1.

Various surveys have compared calibration methods. For instance, the survey conducted by Gupta et al. [2] used EpiPolicy [3] to compare these methods through simulations. The EpiPolicy simulator allows users to include features such as sub-population groups, facilities, population mobility, vaccination, hospitalization, mask-wearing, border closures, lockdowns, and school closures. The survey concluded that the best algorithm depends on the type of model and data, but there are certain algorithms such as Levenberg–Marquardt (least squares) that tend to perform well in many different cases.

The overall process of estimating infection rates can be broken down into two tasks: (i) inferring transition parameters between states and then (ii) using those parameter values to infer infection rates.

Some methods for both tasks make use of contact tracing information which gives how many people each person meets and whom they meet. Because such information is rarely available in quantity, EpiInfer only makes use of the distribution (and even only the mean of the distribution) of how many people each person meets (but not whom).

This paper presents and evaluates a new algorithm called EpiInfer as well as some variants. The paper’s contributions can be divided into a few parts:

Our algorithm EpiInfer consists of two components EpiInfer-core and an outer-loop algorithm ContinuousCalibrate that iterates over various hyperparameter settings in calling EpiInfer-core.
The non-Markovian algorithm EpiInfer-core projects future infection rates starting at any day d by considering (i) the number of newly infected patients in a few days before d, (ii) some information about the meeting distribution of people, and (iii) transition probabilities between the Susceptible (S), Exposed (E), and Infected (I) states.
Because the transition probabilities are not known at the beginning of an epidemic, ContinuousCalibrate recomputes the transition probabilities at each day by calling EpiInfer-core on previous data and various parameter settings. This dynamic setting of parameters follows the spirit of [4].
We show how using multiple locales can help ContinuousCalibrate improve the estimation of transition parameters and thereby improve infection prediction, especially early in an epidemic.
Though EpiInfer is most accurate when given a full meeting distribution, we show that using just the mean number of meetings per person is often enough.
Our experiments show that the two-component EpiInfer predicts better than (Markovian) Differential Equation models and LSTMs (as well as other neural network models) regardless of the forecast window. ARIMA is more accurate than EpiInfer for short-term predictions, but EpiInfer is more accurate than ARIMA starting at 7 days.

2. Related Work

We divide the related work into the two tasks of calibration and prediction of infection.

2.1. Calibration

Algorithms for the calibration of differential equation models can be separated into two major groups: gradient-based methods and gradient-free methods. Here, we review the major algorithms from each group.

2.1.1. Gradient-Based

Least-squares minimization using the Trust Region Reflective method [5] optimizes parameters by constraining updates to stay within a predefined distance (the “trust region”) from the current best guess. It computes the Jacobian matrix and determines the size of the trust region. The parameters are updated iteratively so that the objective function is minimized.

Truncated Newton [6] computes a partial Hessian matrix to reduce computational complexity. It uses this truncated matrix to update the search direction, and a line search to determine the step size along this direction. BFGS [7] is an algorithm that avoids directly computing the actual Hessian matrix of the objective function by iteratively approximating it. After each iteration and update to the approximation of the matrix, this new information is used to determine the step size and the search direction. There is also a limited memory version of this method that stores only a portion of past information to approximate the Hessian matrix [2]. The Levenberg–Marquardt [4] algorithm iteratively updates the parameters of a model. It works by reducing the difference between the predicted data of the compartmental model based on the current parameter estimation and the real data across all compartments available in the current epidemic dataset that is being considered. It starts with an initial guess of the parameters and then updates them using gradient descent.

2.1.2. Gradient-Free

Powell’s method [8] finds the minimum of a function without calculating derivatives. This method iteratively updates search vectors until there are no significant improvements.

Differential Evolution [9] is a heuristic for global optimization over discontinuous spaces or when gradients are unreliable. This method proposes a set of candidate solutions and modifies them discretely, then does recombination in the spirit of genetic algorithms.

Adaptive Memory Programming (AMP) for Global Optimization [10] is an algorithm for global optimization on non-convex problems. It works by maintaining a set of the best solutions found so far, which is called the memory. The search begins with a randomly generated population of solutions that are evaluated with an objective function, and the best ones are chosen to form the initial memory. After every iteration, a new population is created by combining the current population with the memory. The new solutions that are the best according to the objective function (e.g., the estimate of population in each compartment) are added to the memory, and they replace the worst solutions from the memory.

The basin-hopping algorithm [11] tries to find a global minimum even though there are many local minima. It starts from an initial point and then moves randomly to nearby points using an approach close to simulated annealing [12]. After that, it starts looking for nearby points that are lower cost and moving to them. At times, it makes a random move for exploration purposes. This keeps happening until there are not any lower points or a certain stopping condition occurs.

Dual Annealing [13] combines simulated annealing with genetic algorithms. It uses two optimizations at the same time, one conducted locally and one globally. Local optimization uses standard optimization techniques. The global optimization is carried out by moving from a state to one of its neighboring states, where each transition could happen with a certain probability.

2.2. Models That Use Information About Meetings

The above approaches try to calibrate transition likelihoods knowing how many people are in each compartment, but do not use information about the interaction among people. Because information about person-to-person interactions is often available, e.g., through surveys, several methods use that information.

Predicting infection trends can benefit greatly from knowing about the interaction among people for human epidemics but, in general, for agents (e.g., for animal epidemics). The best case from the point of view of analysis occurs when full meeting information is known (who meets whom), such as in contact tracing-based scenarios, but such data greatly infringes on privacy. When only meeting survey information is available (how many people do each person meet), meeting distribution-based models make sense.

2.2.1. Agent-Based Models

Agent-based models trace each interaction between individual agents (e.g., people for human epidemics). Hunter et al. [14] propose a hybrid model combining agent-based simulations and equation-based methods. They showed that the agent-based models helped to make predictions at low infection levels, while the equation-level predictions were computationally more efficient for high infection levels.

Eubank et al. [15] also employ an agent-based approach, utilizing dynamic bipartite graphs generated from urban traffic simulations based on real-world mobility data, to model the spread of infectious diseases. Their findings suggest that human contact networks show small-world properties, because people interact in their own friend groups.

Venkatramanan et al. [16] use a hybrid modeling approach that combines agent-based modeling with Bayesian calibration and simulation optimization (Nelder–Mead) to study epidemics. Bayesian calibration uses response surfaces to estimate parameter distributions and align simulations with observed epidemic data, while Nelder–Mead optimization minimizes the error between simulated and observed curves. Their study shows that data-driven agent-based models can forecast infectious disease outbreaks effectively by integrating diverse data sources.

2.2.2. Contact Tracing-Based Models

Contact tracing-based models offer detailed person-to-person analysis by identifying who interacted with whom during each encounter. Muntoni et al. [17] combine agent-based modeling to simulate epidemic dynamics with probabilistic contact tracing with Belief Propagation and Simple Mean Field algorithms to identify superspreaders and reconstruct transmission pathways. They find that probabilistic contact tracing methods improve epidemic containment by efficiently detecting superspreaders, enhancing backward and multi-step tracing, and reducing social and diagnostic costs compared to traditional contact tracing approaches.

Kim et al. [18] propose a multi-hop contact tracing strategy, which identifies and monitors direct contacts of an infected individual but also traces secondary, tertiary, and further levels of contacts within the transmission network. By using the Microscopic Markov Chain Approach (MMCA), the model simulates disease transmission dynamics and evaluates the effectiveness of multi-hop tracing in controlling outbreaks. Their study shows that multi-hop contact tracing enhances epidemic containment, and their proposed mathematical framework enables faster and more scalable computation compared to traditional Monte Carlo simulations.

Tuschhoff and Kennedy [19] rely on contact tracing data to detect and estimate heterogeneity in susceptibility. Likelihood ratio tests are used to evaluate the fit of competing susceptibility models, and maximum likelihood estimation (MLE) is included to estimate model parameters. They conclude that contact tracing data alone are sufficient to detect and quantify the heterogeneity in susceptibility, allowing early identification of differences in infection likelihood among individuals, which leads to improvements in epidemic modeling and public health interventions.

Hens et al. [20] propose what they call a Next-Generation Matrix (NGM) framework which combines contact matrices, derived from diary-based contact survey data, with previously known epidemiological parameters such as the transmission probability and the duration of infectiousness. The largest eigenvalue of the NGM represents the basic reproduction number. The study enhances the estimation of the basic reproduction number by incorporating professional contact data and refining age-specific interaction patterns.

2.3. Meeting Distribution-Based Models

Unlike agent-based and contact tracing-based models that focus on individual-level contact data, meeting distribution-based models rely on population-level interaction patterns. Veneti et al. [21] estimate age-stratified contact matrices—the average number of daily contacts between individuals of different age groups—and calculate reductions in the basic reproduction number using the next-generation method. Their study showed social distancing measures during COVID-19 reduced the basic reproduction number by 25% compared to what it would have been had there been no social distancing.

Mistry et al. [22] construct age-stratified contact matrices (saying how likely persons of different age groups would meet each other in a variety of locales) using socio-demographic data and POLYMOD surveys. They offer a case study of estimating the reproduction number of the H1N1 2009 epidemic in a variety of countries. They show that taking into account the variability in the distribution of contacts gives greater accuracy compared to a uniform mixing model.

Boyer et al. [23] develop a theoretical transmission model using empirical gathering size data from the BBC Pandemic and Copenhagen Networks Study (CNS). Their transmission model calculates the expected number of new infections at a certain gathering size based on transmission probability and population proportions of susceptible and infectious individuals. Their study shows that for COVID-19, gatherings would have to be severely restricted in size to prevent epidemic spread.

Munday et al. [24] construct an age-specific forecasting framework and apply it to two age-stratified time-series data; the first one is the incidence of SARS-CoV-2 infection estimated from the national infection and antibody prevalence survey, and the second one is the reported cases based on the UK COVID-19 dashboard. They predict new infections by using real-time contact patterns combined with a semi-mechanistic model that estimates how people of different ages interact and transmit the virus, accounting for changes in immunity, vaccination, and infection rates over time. Their study finds that incorporating age interaction can improve infection predictions, particularly in children and older adults, and the social contact data informed model performs the best in the short term (between 2–4 weeks) and the winter months in 2020–2021.

Franco et al. [25] present a method to analyze age-specific variations in transmission parameters associated with susceptibility and infectiousness to SARS-CoV-2 infection. They combine social contact data with the next-generation principle to estimate relative infection rates across different age groups. Their findings indicate that children, especially preschool-aged young children, have about half the susceptibility of adults. Although this study does not explicitly track individual-level meetings, it estimates contact distributions using diary-based social contact data, which serve as an aggregate representation of meeting patterns across age groups.

An important insight from the above excellent work is that knowing as much information as possible about meetings is helpful. Practically speaking, however, we find that basically no contact tracing data is available or, if it is, then only in small quantities. What is available is survey data indicating how many people each person meets.

3. Materials and Methods

EpiInfer assumes that the following is given: the total population, the number of symptomatic individuals (state I) over time, and an estimate of the number of daily contacts per asymptomatic individual (though not who meets whom).

Because we are not using a differential equation model, we are not trying to estimate the parameters of such a model (e.g.,

β

and

σ

). Instead, we try to estimate (and then use) three parameters.

$p_{1}$ the probability that a person in susceptible state S will transition to state E after encountering one person in an asymptomatic but contagious state E.
$p_{2}$ the probability that a person in state E will transition to symptomatic state I.
$i n c$ the number of days of incubation between the time a person enters state E and they enter state I.

To explain our strategy, we will start in Section 3.1 by explaining the EpiInfer-core method to infer infections given the meeting distribution and guesses of

p_{1}

,

p_{2}

, and

i n c

. Next, in Section 3.2, we describe ContinuousCalibrate, a grid/binary search method to find those parameter values: for each

p_{2}

between 0.1 and 1 in 0.1 increments, perform binary search on

p_{1}

to find a set of

p_{1}

and

p_{2}

values for which EpiInfer-core yields the smallest error for the last day. Finally, in Section 9.1, we describe an approach to use multiple more or less isolated locales to infer these parameters better.

3.1. EpiInfer-Core: Estimating Infection Rates from Basic Parameters

For given transition probabilities

p_{1}

,

p_{2}

, average incubation period

i n c

, and statistical information in the form of a distribution of the number of people each person meets each day, the algorithm has several components to predict how many people will be infected on a given day d, given the number of newly infected on previous days.

Anyone who entered the infected/symptomatic state I at time

d - i

, for some

i \leq i n c

, must have already been in the Exposed state at the start of time

d - i n c

. Thus, the probability

p a l r e a d y e x p_{d - i n c}

that an asymptomatic person is already in the exposed state at the start of time

d - i n c

is

p a l r e a d y e x p_{d - i n c} = (\sum_{i = 1}^{i n c} N e w I n f (d - i)) / (p_{2} \cdot a s y m p t o m a t i c (d - i n c))

(1)

where

a s y m p t o m a t i c (d - i n c)

is the number who were asymptomatic at time

d - i n c

(i.e., not in state I) and

N e w I n f (d - i)

is the number of people who enter the infected state on day

d - i

. Thus, the probability

p a l r e a d y e x p_{d - i n c}

is proportional to the number who became infected since

d - i n c

and inversely proportional to the whole population of the asymptomatic ones at

d - i n c

. The

p_{2}

divisor reflects the fact that not all who become exposed subsequently become infected, so if

p_{2}

is small, then few who become exposed later become infected.

So, if a person y meets

n_{y}

people at time

d - i n c

, then y will meet approximately

n_{y} \cdot

p a l r e a d y e x p_{d - i n c}

already exposed people. As a consequence, the probability that an asymptotic person y becomes exposed at

d - i n c

is

p n e w e x p o s e d (d - i n c, y) = 1 - {(1 - p_{1})}^{n_{y} \cdot p a l r e a d y e x p_{d - i n c}} .

(2)

To avoid double-counting the number of exposed people, we want to exclude those people who are already exposed at day

d - i n c

. There are

\sum_{i = 1}^{i n c} N e w I n f (d - i) / p_{2}

people who are already exposed as of that day.

a s y m p t o m a t i c N o t E x

denotes those who are asymptomatic, not exposed, but who could be exposed. Therefore,

\begin{matrix} a s y m p t o m a t i c N o t E x (d - i n c) = a s y m p t o m a t i c (d - i n c) \\ - (\sum_{i = 1}^{i n c} N e w I n f (d - i) / p_{2}) - (r e c o v e r e d (d - i n c)) . \end{matrix}

(3)

The last term reflects the fact that recovered people have immunity for a certain period of time. So, we subtract the number of recovered people at time

d - i n c

from those who could be exposed.

So, the expected number of newly exposed people at

d - i n c

is

N e w E x p o s e d (d - i n c) = \sum_{y i n a s y m p t o m a t i c N o t E x (d - i n c)} p n e w e x p o s e d (d - i n c, y)

(4)

where

p n e w e x p o s e d (d - i n c, y)

is given by Equation (2) and

a s y m p t o m a t i c N o t E x

is given by Equation (3).

To predict the number of newly infected people at time d we multiply the predicted number of newly exposed people at

d - i n c

by

p_{2}

:

N e w I n f (d) = N e w E x p o s e d (d - i n c) \times p_{2}

(5)

Algorithm 1 shows the pseudocode that reflects the above reasoning.

Algorithm 1 EpiInfer-core: Estimating Infection Rates, given meeting distribution information and guesses for

i n c

,

p_{1}

,

p_{2}

.

1:: function SIM( $p_{1}$ , $p_{2}$ )
2:: Compute the probability of being exposed as of time $d - i n c$ Equation (1).
3:: Compute the probability that a given susceptible individual y who meets $n_{y}$ other asymptomatic people will be exposed at time $d - i n c$ Equation (2).
4:: Compute an approximate number of those who are asymptomatic, not exposed, but could be exposed Equation (3).
5:: Compute the number of newly exposed people at $d - i n c$ Equation (4).
6:: Return the prediction of newly infected people at time d by multiplying the number of newly exposed people at time $d - i n c$ by $p_{2}$ Equation (5).
7:: end function

In summary, the goal of Algorithm 1 EpiInfer-core is to estimate the number of people who will be newly infected at time d. It starts by first calculating the probability that a susceptible person is already exposed at previous time point

d - i n c

in Equation (1). It then computes the probability that a susceptible individual meeting asymptomatic people becomes exposed in Equation (2). Next, it approximates the number of asymptomatic individuals who have not been exposed yet in Equation (3), but who have no immunity. Using this, it estimates the number of individuals newly exposed at time

d - i n c

in Equation (4). Finally, it predicts the number of new infections at time d by multiplying the newly exposed individuals at time

d - i n c

by

p_{2}

, the probability that a person who is exposed becomes infected in Equation (5).

3.2. ContinuousCalibrate: Systematic Search for $p_{1}$ and $p_{2}$

In the experiments with EpiPolicy generated data,

i n c

is taken to be the inverse of

σ

, which is the number of days to transition from the exposed state to the infected state. In experiments with real COVID-19 data, we used

i n c = 7

as a hyperparameter, as it is a common value for incubation periods and applies to COVID-19 which is our real data use case. If the incubation period was not known, the parameter sweep would be a three-level nested loop including

i n c

as well as

p_{1}

and

p_{2}

. We present the two-level nested loop case, where we know

i n c

.

For a given number k of training days, the

p_{1}

and

p_{2}

are selected so that they minimize the RMSE between the predicted and real number of infected people from day

k - 1

to k using EpiInfer-core. The search for

p_{2}

is conducted by iterating through a series of possible values, starting from 0.1 up to 1, with a step size of 0.1. For each of those values of

p_{2}

, the search for the best corresponding

p_{1}

is carried out in a manner similar to binary search. The initial middle point for

p_{1}

is 0.5, and if this value causes the predicted data to be larger than the real data, choose

p_{1}

of 0.25. Otherwise, choose

p_{1}

to be 0.75. This process continues until the search is terminated. When the search ends, we will use the

p_{1}

that leads to the lowest RMSE on the training data. Algorithm 2 shows the pseudocode of this approach.

Algorithm 2 Transition Parameter Inference Algorithm Using a Single Locale: For each subsequent training day, the goal is to find

p_{1}

and

p_{2}

values that minimize the Root Mean Squared Error (RMSE) of the number of infected people (in state I).

i n c

is taken to be seven days because that is common in epidemics, but this could be changed.

1:: function SEARCH
2:: for each $p_{2}$ value between 0.1 and 1.0 in increments of 0.1 do
3:: while doing binary search on $p_{1}$ do
4:: run Algorithm 1 (EpiInfer-core) based on current $p_{1}, p_{2}$ pair and $i n c$
5:: calculate RMSE for the last t training days
6:: Update best $p_{1}, p_{2}$ if lowest RMSE seen so far
7:: Update interval for $p_{1}$ as described in Section 3.2
8:: end while
9:: end for
10:: return best $p_{1}, p_{2}$ pair
11:: end function

Section 7 shows that the accuracy as measured in terms of Root Mean Squared Error of Infection Rate estimates varies with

p_{1}

,

i n c

, and

p_{2}

(in descending order of importance), sometimes significantly. The RMSE based on the parameter settings found by ContinuousCalibrate turn out to be the best overall as shown in Appendix B. Thus, ContinuousCalibrate is an important component of EpiInfer.

3.3. Comparison with the Differential Equation Based Model

The differential equation-based model of Figure 1 makes two assumptions underlying its differential equations.

The differential equation model treats all people in the Exposed state as having an equal likelihood of transitioning to the Infected state regardless of when they entered the Exposed state. This is Markovian in that it assumes that the next state depends only on the current state, but not on previous states. This could work if there were no incubation period, i.e., for an SIR model, but many diseases have an incubation period. That is the underlying reason why we think a non-Markovian model is better.
$β$ assumes a uniform meeting distribution of all people in the Susceptible and the Exposed classes. We make use of meeting distribution information when it is available. In Section 8, we see how much accuracy may be lost in EpiInfer by using only the average meeting distribution. The loss in accuracy is significant when the meeting distribution is bimodal for example but less so when uniform.

4. Single Locale Experiments

4.1. Other Methods

We selected ARIMA, a differential equation model with calibration, and LSTM as baseline models for comparison, choosing their hyperparameters based on those that demonstrated the best performance on both real and simulated data for 1-, 5-, 10-, and 20-day predictions. We implemented the ARIMA model using the statsmodels.tsa.arima.model library with hyperparameters

(p, d, q) = (1, 1, 1)

, which turned out to be the best parameter settings. For the differential equation model, we used the SEIR model and fitted parameters using the default Levenberg–Marquardt algorithm provided by the lmfit library. The LSTM model was constructed as a single-layer recurrent neural network with 64 hidden units, followed by a fully connected output layer. It was trained using the Adam optimizer with a learning rate of 0.01. We compared these models with one another and with EpiInfer. For each model M, to predict k days in the future, we apply M to predict one day out in the future and then use that prediction as if it were a real value to predict two days out, and we keep going until k days in the future.

4.2. Method Comparison on Simulation Data

We have made use of the simulation model of the sister project EpiPolicy [3]. The purpose of EpiPolicy is to help policymakers combat epidemics by allowing the policy makers to simulate different parameter settings in an SEIRS model and see the impact on death rates and other health indicators. In our use of EpiPolicy, we have separated the populations for each locale into three groups: children, adults, and seniors.

In the SEIRS model, there are four main parameters:

β

,

σ

,

γ

, and

η

. They represent the rate of transitioning from susceptible to exposed (

β

), the rate of transitioning from exposed to infected (

σ

), the recovery rate (

γ

), and the rate of transitioning from the recovered state back to susceptible (

η

).

In all of the EpiPolicy experiments, the values of

σ

,

γ

, and

η

are fixed at 0.1428, 0.1, and 0.1, respectively. The value of

σ

was chosen to be the reciprocal of 7, because, as noted above, seven days is a commonly seen duration of the incubation period in pandemics like COVID-19 [26].

What varies in these models, based on population type, is

β

. Because the number of daily contacts directly affects the value of the parameter

β

in the SEIRS model, the effect of different numbers of daily contacts was achieved by setting a different value of

β

for each age group. For this experiment, the value of

β

for children is 0.65. For adults, it is 0.75. For seniors, it is 0.5.

Table 1 shows the Root Mean Squared Error of the results for forecasting values from 1 to 20.

Figure 2 shows the predictions from the two best methods ARIMA and EpiInfer.

Figure 3, by contrast, shows the relative Root Mean Squared Error, RMSE, which is the square root of the squared relative errors (

{(\frac{\hat{y} - y}{y})}^{2}

, where

\hat{y}

is the prediction and y is the actual value) of the top two methods (EpiInfer and ARIMA) for days running between 11 and 70 days into the simulated pandemic. Predictions for each such day d are 1, 5, 10, and 20 days in the future (i.e.,

d + 1

,

d + 5

,

d + 10

,

d + 20

).

Both ARIMA and EpiInfer have lower Root Mean Squared Errors (RMSE) than LSTM or differential equation models regardless of how far in the future to forecast. Table 1 shows that ARIMA is better for forecasts up to 7 days in the future after which EpiInfer is better.

In Appendix A, we include comparisons with a graph neural network and a transformer model. Those methods perform similarly to LSTM and thus are inferior to both EpiInfer and ARIMA.

5. Experiments with CoMix/JHU Data on COVID-19

The experiment in this section used meeting data from CoMix [27] and health status data from the COVID-19 time-series data provided by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University [28].

The CoMix study collected information on social contact patterns, behaviors, and attitudes during the COVID-19 pandemic across multiple European countries. People were separated into the following groups based on their daily contacts: 0 contacts, 1–5 contacts, 6–10 contacts, 11–15 contacts, 16–20 contacts, 21–50 contacts, 51–100 contacts, and 100+ contacts.

The Johns Hopkins data, sourced from the JHU COVID-19 GitHub repository (https://github.com/CSSEGISandData/COVID-19, accessed on 25 November 2024), provided daily records of new confirmed cases, new recoveries, and new deaths for each country included in our analysis.

To conduct our analysis, we combined the two data sources. That is, we took the meeting distribution from the CoMix contact data and infection/recovered data from the COVID-19 health status data by country and time. Because our model follows a SEIRS framework, where recovered individuals transition back to the susceptible state, we did not explicitly account for deaths in the transmission dynamics. This is justified in the case of COVID-19 because the death rate was in fact quite low. If we did model death, then we would shrink the number who transitioned from R to S by however many died.

For this experiment, we used 70 data points, representing the daily number of infected individuals from Austria.

Figure 4 shows a visual comparison for CoMix/JHU data predictions for 1–20 days into the future.

Figure 5 shows the predictions from the two best methods ARIMA and EpiInfer.

Both ARIMA and EpiInfer are more accurate than LSTM or differential equation models for all prediction horizons. Table 2 shows a summary of the results for each forecasting value between 1 and 20 days into the future. Figure 6 shows the relative RMSE for the 1-, 5-, 10-, and 20-day predictions for a given number of training days (10–50). Consistent with Table 2, the figure shows visually that EpiInfer outperforms ARIMA for 10- and 20-day predictions, while ARIMA has a lower RMSE for 1-day predictions and they are about equal for 5-day predictions.

6. Experiments with BBC Data on the Flu

The dataset that we used for this experiment is the BBC Pandemic dataset [29], where people from various locations and age groups in the United Kingdom reported their daily number of contacts. The separation into groups was done in the same way as in the experiment on CoMix/JHU data: people were separated into the following groups based on their daily contacts: 1–5 contacts, 6–10 contacts, 11–15 contacts, 16–20 contacts, 21–50 contacts, 51–100 contacts, and 100+ contacts.

The time series information for people infected by the flu in the UK was taken from the Our World in Data website (https://ourworldindata.org/influenza, accessed on 30 June 2025) which provides weekly infection data. For this experiment, we used 11 data points, representing the weekly number of infected individuals.

Because EpiInfer works with daily data, we approximated the daily data by dividing the weekly infection values by 7. (A better approach might be to use splines, but we decided to keep the approach as simple as possible.) At that point, we could predict from week w to week

w + 1

by taking this generated daily data up to week w and then estimating the daily infection values seven days after w. Predicting the number of infections for weeks

w + 2

(14 days out) and

w + 3

(21 days out) was carried out using the same daily prediction method.

Table 3 shows the comparison of all the models on this data for predictions 7, 14, and 21 days into the future. EpiInfer performs the best, followed by ARIMA, though with a high p-value for 7 days.

7. Is Infection Prediction Sensitive to the Precise Values of Transition Parameters

To determine how sensitive the accuracy of infection prediction is to the particular values of transition parameters, we re-analyzed the CoMix/JHU COVID-19 and EpiPolicy data as follows. Recall that at each day d, ContinuousCalibrate estimates values of

p_{1}

and

p_{2}

based on the information known up to day d. Using that approach, Section 5 achieved certain Root Mean Squared Error values.

We now ask how the RMSE changes if the values of

i n c

,

p_{1}

, and

p_{2}

are changed.

Appendix B shows results from CoMix/JHU (Covid) data for 1-, 5-, 10-, 15-, and 20-day forecasts. That appendix also performs sensitivity analysis for EpiPolicy-generated data using parameters that are the same as in the single locale experiment of Section 4.2.

The appendix shows the quality (in terms of Root Mean Squared Error) of the predictions when changes are made to the parameters

p_{1}

,

p_{2}

, and

i n c

. There are 27 different values, corresponding to three possibilities for each parameter. For

p_{1}

and

p_{2}

, the considered values are 90%,100%, and 110% of the ContinuousCalibrate-inferred values. For

i n c

, the additive offset is either -3, 0, or 3. Table A13 shows that the parameter setting determined by ContinuousCalibrate has the lowest overall RMSE when considering both real and simulated data at 1-, 5-, 10-, 15-, and 20-day forecasts.

8. Using Average Number of Meetings Instead of Meeting Distribution

In this group of experiments, we test whether it is possible to use the mean number of meetings (computed by taking the number of meetings per person and dividing by the number of people) instead of a meeting distribution that specifies how many people each person meets on a daily basis.

8.1. EpiPolicy Data Regular Meeting Distribution Experiment

Figure 7 shows the comparison of these approaches for 1-, 5-, 10-, and 20-day predictions on EpiPolicy simulated data. Every parameter except

β

is the same as described in 4.2. The values of

β

for the three groups are 0.5, 0.9, and 0.15. The full distribution gives better results, but often not by much (the difference between the means of the relative RMSE is at most 0.03). In fact, there is no statistically significant difference between the two approaches for either short-term or long-term predictions.

8.2. EpiPolicy Data Bimodal Meeting Distribution Scenario

The next experiment represents a highly bimodal scenario, where 40% of people have a

β

value of 0, and 60% of people have

β = 1.5

.

Figure 8 shows the comparison of these approaches for 1-, 5-, 10-, and 20-day predictions.

For 1-day predictions, there is no statistically significant difference between knowing the full meeting distribution versus using the mean. For 5-day predictions, the difference between the means of the relative RMSE is 0.18, with a p-value of 0.0002. For 10-day predictions, the difference between the means is 0.49, and the p-value is less than

10^{- 5}

. For 20-day predictions, the difference between the means is 1.35 with a p-value of 0.0002.

8.3. EpiPolicy Data Gamma Meeting Distribution Experiment

The final series of experiments consists of three different meeting distributions, where each one follows a discretized version of the gamma distribution with the same mean (12), but with differing variances. The parameters

α

and

θ

are (12, 1), (2, 6), and (1, 12) respectively.

The positive effect of using EpiInfer with full meeting distribution data can best be seen when comparing the two approaches for 20-day predictions across all three meeting distributions.

For the meeting distribution with

α = 12

and

θ = 1

, the difference between the means is 0.12, with a p-value of 0.207. For

α = 2

and

θ = 6

, the difference between the means is 0.15, with a p-value of 0.1332. For

α = 1

and

θ = 12

, the difference between the means is 0.36, with a p-value of 0.024. Figure 9 shows the comparison of these approaches for 20-day predictions across all three

α

,

θ

pairs.

The last two experiments show that using EpiInfer with full meeting distribution data is most beneficial when the variance of the number of meetings per person is high. That is, using the mean number of meetings is significantly worse when the number of meetings differs a lot between population subgroups.

The conclusion we can draw from these experiments is that we should use the meeting distribution when available, but that the mean of the meeting distribution (i.e., the average number of people each person meets) is often nearly as good unless the meeting distribution is highly bimodal or strongly fractal.

8.4. CoMix/JHU Mean-of-Meetings Experiment

Figure 10 shows the comparison of full meeting distribution versus the mean number of meetings per person for 1-, 5-, 10-, and 20-day predictions on CoMix/JHU data. The two characterizations of meeting distributions show similar Root Mean Squared Errors (the difference between the means of the relative RMSE is less than 0.01), and there is no statistically significant difference between the two approaches either for short-term or long-term predictions.

9. Multiple Locales

9.1. Multilocale Inference

It is plausible that using data from multiple locales might help to infer

p_{1}

and

p_{2}

more accurately, assuming that these probabilities are independent of the locale. The goal of our experiments on this topic is to find out which is best: using one locale, similar (in terms of meeting distribution) locales, or all locales.

For the three locale choices, the search for

p_{1}

and

p_{2}

is carried out basically as described in Section 3.2 (apply EpiInfer-core to a systematic search among

p_{1}

and

p_{2}

values). The only difference is the way the RMSE is calculated. We are using relative RMSE from multiple locales instead of the RMSE from a single locale. Algorithm 3 shows the pseudocode of this approach.

Algorithm 3 Transition Parameter Inference Algorithm Using Multiple Locales: For each subsequent training day, the goal is to find

p_{1}

and

p_{2}

values that minimize the relative Root Mean Squared Error (RMSE) of the number of infected people (in state I). Here the optimization is done over all locales.

1:: function SEARCH(locales)
2:: for each $p_{2}$ value between 0.1 and 1.0 in increments of 0.1 do
3:: while doing binary search on $p_{1}$ do
4:: for each locale L in locales do
5:: run Algorithm 1 (EpiInfer-core) based on current $p_{1}, p_{2}$ pair
6:: calculate relative RMSE for the last t training days
7:: end for
8:: Calculate relative RMSE over all locales for the last t training days
9:: Update best $p_{1}, p_{2}$ if lowest relative RMSE seen so far
10:: Update interval for $p_{1}$ as described in Section 3.2
11:: end while
12:: end for
13:: return best $p_{1}, p_{2}$ pair
14:: end function

The algorithm for multi-locales, Algorithm 3, searches for the best transition parameters

p_{1}

and

p_{2}

that minimize the relative RMSE of infected individuals across multiple locales.

Given a set of n true values (coming from n different locales)

y_{1}, y_{2}, \dots, y_{n}

and corresponding predicted values

{\hat{y}}_{1}, {\hat{y}}_{2}, \dots, {\hat{y}}_{n}

, the relative RMSE is defined as:

Relative RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(\frac{{\hat{y}}_{i} - y_{i}}{y_{i}})}^{2}}

(6)

Algorithm 3 loops through different

p_{2}

values (0.1 to 1.0) and performs a binary search on

p_{1}

for each

p_{2}

. For each pair of

p_{1}

and

p_{2}

, it runs the EpiInfer-core (Algorithm 1) on each locale, calculates the relative RMSE over the last t training days, and updates the best parameter pair if a lower relative RMSE is found. For the following experiment, the value of t is one. Finally, it returns the

p_{1}

and

p_{2}

pair with the lowest relative RMSE.

9.2. Multilocale Experiment with EpiPolicy Data

This experiment tests whether making use of data from multiple locales can lead to better results than using data from just a single locale. Figure 11 shows the comparison of three approaches—single locale, similar locales (having similar meeting distributions), and locales having different meeting distributions—for 1- to 20-day predictions. Because using locales having different meeting distributions leads to worse results on average than using similar locales, all our subsequent comparisons will be between using similar locales (which we will call EpiInfer-multi) and EpiInfer on a single locale.

Table 4 shows that both EpiInfer and EpiInfer-multi improve over ARIMA for forecasts more than 7 days in the future. EpiInfer-multi already has a statistically significantly lower Root Mean Squared Error (RMSE) for forecasts of four days in the future or more. EpiInfer on multiple locales having similar meeting distributions is more accurate than EpiInfer on a single locale no matter what the forecasting horizon is.

Figure 12 shows a visual comparison for EpiPolicy data predictions from 1–20 days into the future.

Figure 13 shows predictions from the two best methods ARIMA and EpiInfer-multi using locales having similar meeting distributions.

Figure 14 shows a comparison of ARIMA and EpiInfer-multi that uses data from similar locales. There is a notable difference compared to Figure 3. EpiInfer with similar locales has a lower mean relative RMSE than ARIMA for 5-, 10-, and 20-day predictions, while showing worse performance for 1-day predictions.

10. Potential Deployment Scenarios

The motivation for this work came from an issue one of us (Shasha) encountered when trying to give computational support through the tool EpiPolicy during the COVID-19 epidemic to the policy makers of a certain country. While EpiPolicy could give highly targeted policy advice (e.g., regarding whom to vaccinate, how much social distancing to practice), it could do that only with the knowledge of how each intervention would affect infection rates. In simulation, this worked fine when we knew the effect on transition parameters of any given change. However, such knowledge is very hard to know a priori.

Thus, the primary use cases for EpiInfer are (i) to predict infection rates as the disease mutates and/or after counter-measures have been applied, (ii) be able to simulate infection rates based on changes in transition parameters

p_{1}

and

p_{2}

, and (iii) be able to learn transition parameters after an intervention has taken place or as a disease mutates.

Next time an epidemic breaks out, EpiInfer will be there to help.

Scalability and Large-Scale/Real-Time Applications

The EpiPolicy data experiment with two locales takes less than 20 s to run. The single locale experiment that has a runtime of under 10 s. Because the number of used locales for future use will most likely be 10 or fewer, scalability should not be an issue. The CoMix/JHU data experiment has a runtime of roughly 21 s. Thus, the model is easily fast enough for large-scale applications in which decisions are implemented over days not seconds.

11. Conclusions and Future Work

This paper proposes, describes, and evaluates a new non-Markovian infection prediction algorithm called EpiInfer which takes advantage of history and meeting distributions to give higher accuracy for medium- to long-term (roughly 7–20 day) forecasts than the main prominent state-of-the-art models (ARIMA, LSTM, transformer, graph neural network, and Differential Equation models with calibration). These results apply to both simulated data and real data involving COVID-19 and flu epidemics.

Because the results show that ARIMA offers better predictions for short-term predictions but EpiInfer for longer-term predictions, we suggest that, in practice, ARIMA(1,1,1) should be used for forecasts of 1 to 6 days and EpiInfer for longer forecasts.

Other results of our study show that

The re-calculation of the transition parameter values every day ( $p_{1}$ and $p_{2}$ ) and the proper setting of the incubation period $i n c$ help achieve a low RMSE. Poor estimations of $p_{1}$ in particular may worsen the result substantially.
On simulated data, if several locales have similar meeting distributions, then EpiInfer can obtain better results when ContinuousCalibrate infers transition parameters $p_{1}$ and $p_{2}$ on all those locales at the same time (EpiInfer-multi).
On both real and simulated data, we have found that knowing just the mean of the number of meetings of each person gives results that are nearly as good as knowing the full distribution unless the variance in meeting distribution is very large.

Future Work

We envision both algorithmic and application future work. Future algorithmic work includes building and testing an algorithm that can be applied to a setting where contact tracing and symptomatic data are available. Future applications work entails applying EpiInfer to a variety of epidemic-style diseases.

Author Contributions

Conceptualization, D.S.; Data curation, J.K. and R.Y.; Formal analysis, J.K. and D.S.; Investigation, J.K. and D.S.; Methodology, J.K. and D.S.; Resources, J.K. and R.Y.; Software, J.K. and R.Y.; Validation, J.K., R.Y. and D.S.; Writing—original draft, J.K., R.Y. and D.S.; Writing—review and editing, J.K., R.Y. and D.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partly funded by NYU Wireless and US NIH National Institute of General Medical Sciences 5R01GM121753-04 NCE and US National Science Foundation 1840761 NCE 2.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The software and data can be found at the GitHub repository https://github.com/JovanK26/EpiInfer (accessed on 10 July 2025).

Acknowledgments

The authors would like to acknowledge helpful discussions regarding this problem with Simona Avesani, Andrea Betti, Rosalba Giugno, Alisa Kumbara, and Lorenzo Ruggeri of the University of Verona, Vincenzo Bonnici of the University of Parma, Roberto Grasso of the University of Catania, and Jorge Rodriguez of Khalifa University.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Appendix A. Comparison with Transformer and Graph Neural Net on Real and Simulated Data

Transformers and Graph Neural Nets on Comix/JHU and EpiPolicy data show Root Mean Squared Errors that are similar to those of LSTMs and thus greater than those of EpiInfer. Table A1 and Table A2 show a comparison of these two models to EpiInfer.

Table A1. Root Mean Squared Error (RMSE) of predictions from 1 to 20 days using Transformer, GNN, and EpiInfer on CoMix/JHU data.

Days Predicted	Transformer	GNN	EpiInfer
1	0.1060	0.1558	0.0892
2	0.2011	0.2146	0.1501
3	0.2819	0.2729	0.1880
4	0.3504	0.3346	0.2064
5	0.4102	0.3857	0.2233
6	0.4656	0.4418	0.2430
7	0.5228	0.4788	0.2632
8	0.5788	0.5154	0.2673
9	0.6252	0.5678	0.2681
10	0.6757	0.6069	0.2660
11	0.7268	0.6440	0.2616
12	0.7933	0.7215	0.2576
13	0.8527	0.7808	0.2567
14	0.9000	0.8250	0.2605
15	0.9828	0.9051	0.2675
16	1.0496	1.2862	0.2786
17	1.1412	1.0498	0.2922
18	1.2267	1.1628	0.3066
19	1.3160	1.2116	0.3203
20	1.4226	1.3196	0.3351

Table A2. Root Mean Squared Error (RMSE) of predictions from 1 to 20 days using Transformer, GNN, and EpiInfer on EpiPolicy generated data.

Days Predicted	Transformer	GNN	EpiInfer Multilocale
1	0.0803	0.1638	0.0252
2	0.1684	0.2260	0.0341
3	0.2361	0.2934	0.0413
4	0.2963	0.3509	0.0509
5	0.3514	0.3975	0.0609
6	0.3846	0.4375	0.0759
7	0.4189	0.4640	0.0909
8	0.4428	0.4873	0.1013
9	0.4646	0.5119	0.1119
10	0.4817	0.5290	0.1269
11	0.4985	0.5469	0.1378
12	0.5125	0.5530	0.1504
13	0.5208	0.5606	0.1636
14	0.5280	0.5704	0.1722
15	0.5314	0.5697	0.1819
16	0.5385	0.5775	0.1924
17	0.5450	0.5812	0.1984
18	0.5496	0.5808	0.2025
19	0.5458	0.5808	0.2078
20	0.5475	0.5833	0.2130

Appendix B. Sensitivity of Infection Prediction to Parameter Estimation

This appendix determines the sensitivity of RMSE to variations of estimates of

i n c

,

p_{1}

, and

p_{2}

on two datasets CoMix/JHU and EpiPolicy single locale.

Table A3, Table A4, Table A5, Table A6 and Table A7 show CoMix/JHU data root mean squared error (RMSE) values for combinations of inc offset,

p_{1}

multiplier, and

p_{2}

multiplier. Table A8, Table A9, Table A10, Table A11 and Table A12 show the same for EpiPolicy data.

Table A3. CoMix/JHU data 1-day prediction root mean squared error (RMSE) values for combinations of inc offset,

p_{1}

multiplier, and

p_{2}

multiplier. We bold the minimum RMSE in this and subsequent tables.

Table A3. CoMix/JHU data 1-day prediction root mean squared error (RMSE) values for combinations of inc offset,

p_{1}

multiplier, and

p_{2}

multiplier. We bold the minimum RMSE in this and subsequent tables.

#	Inc Offset	$p_{1}$ Multiplier	$p_{2}$ Multiplier	RMSE
1	$- 3$	0.9	0.9	0.0532
2	$- 3$	0.9	1.0	0.0527
3	$- 3$	0.9	1.1	0.0522
4	$- 3$	1.0	0.9	0.0800
5	$- 3$	1.0	1.0	0.0808
6	$- 3$	1.0	1.1	0.0815
7	$- 3$	1.1	0.9	0.1777
8	$- 3$	1.1	1.0	0.1786
9	$- 3$	1.1	1.1	0.1793
10	0	0.9	0.9	0.0476
11	0	0.9	1.0	0.0470
12	0	0.9	1.1	0.0466
13	0	1.0	0.9	0.0881
14	0	1.0	1.0	0.0892
15	0	1.0	1.1	0.0902
16	0	1.1	0.9	0.1864
17	0	1.1	1.0	0.1877
18	0	1.1	1.1	0.1888
19	3	0.9	0.9	0.0491
20	3	0.9	1.0	0.0485
21	3	0.9	1.1	0.0480
22	3	1.0	0.9	0.0996
23	3	1.0	1.0	0.1008
24	3	1.0	1.1	0.1018
25	3	1.1	0.9	0.1965
26	3	1.1	1.0	0.1978
27	3	1.1	1.1	0.1989

Table A4. CoMix/JHU data 5-day prediction root mean squared error (RMSE) values for combinations of inc offset,

p_{1}

multiplier, and

p_{2}

multiplier.

Table A4. CoMix/JHU data 5-day prediction root mean squared error (RMSE) values for combinations of inc offset,

p_{1}

multiplier, and

p_{2}

multiplier.

#	Inc Offset	$p_{1}$ Multiplier	$p_{2}$ Multiplier	RMSE
1	$- 3$	0.9	0.9	0.1955
2	$- 3$	0.9	1.0	0.1952
3	$- 3$	0.9	1.1	0.1949
4	$- 3$	1.0	0.9	0.1700
5	$- 3$	1.0	1.0	0.1703
6	$- 3$	1.0	1.1	0.1706
7	$- 3$	1.1	0.9	0.1859
8	$- 3$	1.1	1.0	0.1867
9	$- 3$	1.1	1.1	0.1873
10	0	0.9	0.9	0.1729
11	0	0.9	1.0	0.1739
12	0	0.9	1.1	0.1747
13	0	1.0	0.9	0.2220
14	0	1.0	1.0	0.2233
15	0	1.0	1.1	0.2245
16	0	1.1	0.9	0.3144
17	0	1.1	1.0	0.3160
18	0	1.1	1.1	0.3173
19	3	0.9	0.9	0.2189
20	3	0.9	1.0	0.2203
21	3	0.9	1.1	0.2215
22	3	1.0	0.9	0.3099
23	3	1.0	1.0	0.3117
24	3	1.0	1.1	0.3131
25	3	1.1	0.9	0.4214
26	3	1.1	1.0	0.4233
27	3	1.1	1.1	0.4249

Table A5. CoMix/JHU data 10-day prediction root mean squared error (RMSE) values for combinations of inc offset,

p_{1}

multiplier, and

p_{2}

multiplier.

Table A5. CoMix/JHU data 10-day prediction root mean squared error (RMSE) values for combinations of inc offset,

p_{1}

multiplier, and

p_{2}

multiplier.

#	Inc Offset	$p_{1}$ Multiplier	$p_{2}$ Multiplier	RMSE
1	$- 3$	0.9	0.9	0.4492
2	$- 3$	0.9	1.0	0.4486
3	$- 3$	0.9	1.1	0.4481
4	$- 3$	1.0	0.9	0.3700
5	$- 3$	1.0	1.0	0.3693
6	$- 3$	1.0	1.1	0.3688
7	$- 3$	1.1	0.9	0.2896
8	$- 3$	1.1	1.0	0.2893
9	$- 3$	1.1	1.1	0.2889
10	0	0.9	0.9	0.2597
11	0	0.9	1.0	0.2596
12	0	0.9	1.1	0.2595
13	0	1.0	0.9	0.2650
14	0	1.0	1.0	0.2660
15	0	1.0	1.1	0.2668
16	0	1.1	0.9	0.3047
17	0	1.1	1.0	0.3058
18	0	1.1	1.1	0.3068
19	3	0.9	0.9	0.3542
20	3	0.9	1.0	0.3557
21	3	0.9	1.1	0.3569
22	3	1.0	0.9	0.4350
23	3	1.0	1.0	0.4369
24	3	1.0	1.1	0.4384
25	3	1.1	0.9	0.5529
26	3	1.1	1.0	0.5552
27	3	1.1	1.1	0.5572

Table A6. CoMix/JHU data 15-day prediction root mean squared error (RMSE) values for combinations of inc offset,

p_{1}

multiplier, and

p_{2}

multiplier.

Table A6. CoMix/JHU data 15-day prediction root mean squared error (RMSE) values for combinations of inc offset,

p_{1}

multiplier, and

p_{2}

multiplier.

#	Inc Offset	$p_{1}$ Multiplier	$p_{2}$ Multiplier	RMSE
1	$- 3$	0.9	0.9	0.5884
2	$- 3$	0.9	1.0	0.5880
3	$- 3$	0.9	1.1	0.5876
4	$- 3$	1.0	0.9	0.5278
5	$- 3$	1.0	1.0	0.5272
6	$- 3$	1.0	1.1	0.5267
7	$- 3$	1.1	0.9	0.4608
8	$- 3$	1.1	1.0	0.4601
9	$- 3$	1.1	1.1	0.4596
10	0	0.9	0.9	0.3447
11	0	0.9	1.0	0.3436
12	0	0.9	1.1	0.3428
13	0	1.0	0.9	0.2680
14	0	1.0	1.0	0.2675
15	0	1.0	1.1	0.2671
16	0	1.1	0.9	0.2414
17	0	1.1	1.0	0.2417
18	0	1.1	1.1	0.2419
19	3	0.9	0.9	0.3185
20	3	0.9	1.0	0.3193
21	3	0.9	1.1	0.3201
22	3	1.0	0.9	0.3918
23	3	1.0	1.0	0.3938
24	3	1.0	1.1	0.3954
25	3	1.1	0.9	0.5409
26	3	1.1	1.0	0.5438
27	3	1.1	1.1	0.5462

Table A7. CoMix/JHU data 20-day root mean squared error (RMSE) values for combinations of inc offset,

p_{1}

multiplier, and

p_{2}

multiplier.

Table A7. CoMix/JHU data 20-day root mean squared error (RMSE) values for combinations of inc offset,

p_{1}

multiplier, and

p_{2}

multiplier.

#	Inc Offset	$p_{1}$ Multiplier	$p_{2}$ Multiplier	RMSE
1	$- 3$	0.9	0.9	0.6730
2	$- 3$	0.9	1.0	0.6726
3	$- 3$	0.9	1.1	0.6723
4	$- 3$	1.0	0.9	0.6241
5	$- 3$	1.0	1.0	0.6236
6	$- 3$	1.0	1.1	0.6232
7	$- 3$	1.1	0.9	0.5686
8	$- 3$	1.1	1.0	0.5681
9	$- 3$	1.1	1.1	0.5676
10	0	0.9	0.9	0.4478
11	0	0.9	1.0	0.4468
12	0	0.9	1.1	0.4460
13	0	1.0	0.9	0.3362
14	0	1.0	1.0	0.3351
15	0	1.0	1.1	0.3342
16	0	1.1	0.9	0.2378
17	0	1.1	1.0	0.2368
18	0	1.1	1.1	0.2360
19	3	0.9	0.9	0.2918
20	3	0.9	1.0	0.2921
21	3	0.9	1.1	0.2922
22	3	1.0	0.9	0.3876
23	3	1.0	1.0	0.3890
24	3	1.0	1.1	0.3902
25	3	1.1	0.9	0.5808
26	3	1.1	1.0	0.5839
27	3	1.1	1.1	0.5865

Table A8. EpiPolicy data 1-day prediction root mean squared error (RMSE) values for combinations of inc offset,

p_{1}

multiplier, and

p_{2}

multiplier.

Table A8. EpiPolicy data 1-day prediction root mean squared error (RMSE) values for combinations of inc offset,

p_{1}

multiplier, and

p_{2}

multiplier.

#	Inc Offset	$p_{1}$ Multiplier	$p_{2}$ Multiplier	RMSE
1	$- 3$	0.9	0.9	0.2402
2	$- 3$	0.9	1.0	0.2176
3	$- 3$	0.9	1.1	0.2046
4	$- 3$	1.0	0.9	0.0641
5	$- 3$	1.0	1.0	0.0440
6	$- 3$	1.0	1.1	0.0363
7	$- 3$	1.1	0.9	0.3278
8	$- 3$	1.1	1.0	0.3228
9	$- 3$	1.1	1.1	0.3342
10	0	0.9	0.9	0.1880
11	0	0.9	1.0	0.2329
12	0	0.9	1.1	0.2828
13	0	1.0	0.9	0.0633
14	0	1.0	1.0	0.0420
15	0	1.0	1.1	0.1664
16	0	1.1	0.9	0.2441
17	0	1.1	1.0	0.2366
18	0	1.1	1.1	0.3285
19	3	0.9	0.9	0.1804
20	3	0.9	1.0	0.2305
21	3	0.9	1.1	0.3428
22	3	1.0	0.9	0.0911
23	3	1.0	1.0	0.0567
24	3	1.0	1.1	0.2367
25	3	1.1	0.9	0.3081
26	3	1.1	1.0	0.2995
27	3	1.1	1.1	0.3965

Table A9. EpiPolicy data 5-day prediction root mean squared error (RMSE) values for combinations of inc offset,

p_{1}

multiplier, and

p_{2}

multiplier.

Table A9. EpiPolicy data 5-day prediction root mean squared error (RMSE) values for combinations of inc offset,

p_{1}

multiplier, and

p_{2}

multiplier.

#	Inc Offset	$p_{1}$ Multiplier	$p_{2}$ Multiplier	RMSE
1	$- 3$	0.9	0.9	0.3216
2	$- 3$	0.9	1.0	0.3046
3	$- 3$	0.9	1.1	0.2915
4	$- 3$	1.0	0.9	0.2022
5	$- 3$	1.0	1.0	0.1934
6	$- 3$	1.0	1.1	0.1862
7	$- 3$	1.1	0.9	0.5013
8	$- 3$	1.1	1.0	0.5111
9	$- 3$	1.1	1.1	0.5259
10	0	0.9	0.9	0.2896
11	0	0.9	1.0	0.3333
12	0	0.9	1.1	0.3783
13	0	1.0	0.9	0.1847
14	0	1.0	1.0	0.1650
15	0	1.0	1.1	0.2797
16	0	1.1	0.9	0.3881
17	0	1.1	1.0	0.3796
18	0	1.1	1.1	0.4665
19	3	0.9	0.9	0.2648
20	3	0.9	1.0	0.3324
21	3	0.9	1.1	0.4530
22	3	1.0	0.9	0.1867
23	3	1.0	1.0	0.1628
24	3	1.0	1.1	0.3558
25	3	1.1	0.9	0.3968
26	3	1.1	1.0	0.3888
27	3	1.1	1.1	0.4978

Table A10. EpiPolicy data 10-day prediction root mean squared error (RMSE) values for combinations of inc offset,

p_{1}

multiplier, and

p_{2}

multiplier.

Table A10. EpiPolicy data 10-day prediction root mean squared error (RMSE) values for combinations of inc offset,

p_{1}

multiplier, and

p_{2}

multiplier.

#	Inc Offset	$p_{1}$ Multiplier	$p_{2}$ Multiplier	RMSE
1	$- 3$	0.9	0.9	0.3562
2	$- 3$	0.9	1.0	0.3428
3	$- 3$	0.9	1.1	0.3316
4	$- 3$	1.0	0.9	0.2915
5	$- 3$	1.0	1.0	0.2886
6	$- 3$	1.0	1.1	0.2860
7	$- 3$	1.1	0.9	0.5319
8	$- 3$	1.1	1.0	0.5592
9	$- 3$	1.1	1.1	0.5895
10	0	0.9	0.9	0.3211
11	0	0.9	1.0	0.3743
12	0	0.9	1.1	0.4239
13	0	1.0	0.9	0.2410
14	0	1.0	1.0	0.2270
15	0	1.0	1.1	0.3552
16	0	1.1	0.9	0.4509
17	0	1.1	1.0	0.4486
18	0	1.1	1.1	0.5459
19	3	0.9	0.9	0.3304
20	3	0.9	1.0	0.3932
21	3	0.9	1.1	0.5024
22	3	1.0	0.9	0.2627
23	3	1.0	1.0	0.2524
24	3	1.0	1.1	0.4566
25	3	1.1	0.9	0.4592
26	3	1.1	1.0	0.4535
27	3	1.1	1.1	0.5084

Table A11. EpiPolicy data 15-day prediction root mean squared error (RMSE) values for combinations of inc offset,

p_{1}

multiplier, and

p_{2}

multiplier.

Table A11. EpiPolicy data 15-day prediction root mean squared error (RMSE) values for combinations of inc offset,

p_{1}

multiplier, and

p_{2}

multiplier.

#	Inc Offset	$p_{1}$ Multiplier	$p_{2}$ Multiplier	RMSE
1	$- 3$	0.9	0.9	0.3705
2	$- 3$	0.9	1.0	0.3566
3	$- 3$	0.9	1.1	0.3446
4	$- 3$	1.0	0.9	0.3126
5	$- 3$	1.0	1.0	0.3151
6	$- 3$	1.0	1.1	0.3168
7	$- 3$	1.1	0.9	0.3960
8	$- 3$	1.1	1.0	0.4125
9	$- 3$	1.1	1.1	0.4304
10	0	0.9	0.9	0.3482
11	0	0.9	1.0	0.3911
12	0	0.9	1.1	0.4423
13	0	1.0	0.9	0.2889
14	0	1.0	1.0	0.2802
15	0	1.0	1.1	0.4088
16	0	1.1	0.9	0.4592
17	0	1.1	1.0	0.4665
18	0	1.1	1.1	0.5723
19	3	0.9	0.9	0.3468
20	3	0.9	1.0	0.4145
21	3	0.9	1.1	0.5450
22	3	1.0	0.9	0.3036
23	3	1.0	1.0	0.2808
24	3	1.0	1.1	0.5077
25	3	1.1	0.9	0.5053
26	3	1.1	1.0	0.4931
27	3	1.1	1.1	0.5979

Table A12. EpiPolicy data 20-day prediction root mean squared error (RMSE) values for combinations of inc offset,

p_{1}

multiplier, and

p_{2}

multiplier.

Table A12. EpiPolicy data 20-day prediction root mean squared error (RMSE) values for combinations of inc offset,

p_{1}

multiplier, and

p_{2}

multiplier.

#	Inc Offset	$p_{1}$ Multiplier	$p_{2}$ Multiplier	RMSE
1	$- 3$	0.9	0.9	0.3788
2	$- 3$	0.9	1.0	0.3631
3	$- 3$	0.9	1.1	0.3469
4	$- 3$	1.0	0.9	0.3074
5	$- 3$	1.0	1.0	0.2971
6	$- 3$	1.0	1.1	0.2923
7	$- 3$	1.1	0.9	0.3045
8	$- 3$	1.1	1.0	0.3042
9	$- 3$	1.1	1.1	0.3077
10	0	0.9	0.9	0.3361
11	0	0.9	1.0	0.3807
12	0	0.9	1.1	0.4238
13	0	1.0	0.9	0.2872
14	0	1.0	1.0	0.2816
15	0	1.0	1.1	0.4065
16	0	1.1	0.9	0.3885
17	0	1.1	1.0	0.3925
18	0	1.1	1.1	0.4860
19	3	0.9	0.9	0.3662
20	3	0.9	1.0	0.4303
21	3	0.9	1.1	0.5244
22	3	1.0	0.9	0.3381
23	3	1.0	1.0	0.3153
24	3	1.0	1.1	0.5499
25	3	1.1	0.9	0.5213
26	3	1.1	1.0	0.5121
27	3	1.1	1.1	0.5815

Table A13. Parameter settings ordered by best average rank determined by smallest RMSE on both real and simulated data. The default setting (14) has the best overall ranking, even though, for many forecast days, it is not the best. No other setting is consistently better.

Parameter Setting
14 ( $i n c$ , 1.0· $p_{1}$ , 1.0· $p_{2}$ )
13 ( $i n c$ , 1.0· $p_{1}$ , 0.9· $p_{2}$ )
10 ( $i n c$ , 0.9· $p_{1}$ , 0.9· $p_{2}$ )
6 ( $i n c$ -3, 1.0· $p_{1}$ , 1.1· $p_{2}$ )
19 ( $i n c$ +3, 0.9· $p_{1}$ , 0.9· $p_{2}$ )
5 ( $i n c$ -3, 1.0· $p_{1}$ , 1.0· $p_{2}$ )
23 ( $i n c$ +3, 1.0· $p_{1}$ , 1.0· $p_{2}$ )
11 ( $i n c$ , 0.9· $p_{1}$ , 1.0· $p_{2}$ )
4 ( $i n c$ -3, 1.0· $p_{1}$ , 0.9· $p_{2}$ )
22 ( $i n c$ +3, 1.0· $p_{1}$ , 0.9· $p_{2}$ )
15 ( $i n c$ , 1.0· $p_{1}$ , 1.1· $p_{2}$ )
12 ( $i n c$ , 0.9· $p_{1}$ , 1.1· $p_{2}$ )
20 ( $i n c$ +3, 0.9· $p_{1}$ , 1.0· $p_{2}$ )
3 ( $i n c$ -3, 0.9· $p_{1}$ , 1.1· $p_{2}$ )
16 ( $i n c$ , 1.1· $p_{1}$ , 0.9· $p_{2}$ )
17 ( $i n c$ , 1.1· $p_{1}$ , 1.0· $p_{2}$ )
2 ( $i n c$ -3, 0.9· $p_{1}$ , 1.0· $p_{2}$ )
7 ( $i n c$ -3, 1.1· $p_{1}$ , 0.9· $p_{2}$ )
8 ( $i n c$ -3, 1.1· $p_{1}$ , 1.0· $p_{2}$ )
1 ( $i n c$ -3, 0.9· $p_{1}$ , 0.9· $p_{2}$ )
21 ( $i n c$ +3, 0.9· $p_{1}$ , 1.1· $p_{2}$ )
9 ( $i n c$ -3, 1.1· $p_{1}$ , 1.1· $p_{2}$ )
18 ( $i n c$ , 1.1· $p_{1}$ , 1.1· $p_{2}$ )
24 ( $i n c$ +3, 1.0· $p_{1}$ , 1.1· $p_{2}$ )
26 ( $i n c$ +3, 1.1· $p_{1}$ , 1.0· $p_{2}$ )
25 ( $i n c$ +3, 1.1· $p_{1}$ , 0.9· $p_{2}$ )
27 ( $i n c$ +3, 1.1· $p_{1}$ , 1.1· $p_{2}$ )

References

Kermark, M.; Mckendrick, A. Contributions to the mathematical theory of epidemics. Part I. Proc. R. Soc. A 1927, 115, 700–721. [Google Scholar]
Gupta, N.; Mai, A.; Abouzied, A.; Shasha, D. On the calibration of compartmental epidemiological models. arXiv 2023, arXiv:2312.05456. [Google Scholar] [CrossRef]
Mai, A.L.X.; Mannino, M.; Tariq, Z.; Abouzied, A.; Shasha, D. Epipolicy: A tool for combating epidemics. XRDS Crossroads ACM Mag. Stud. 2022, 28, 24–29. [Google Scholar] [CrossRef]
Moré, J.J. The Levenberg-Marquardt algorithm: Implementation and theory. In Numerical Analysis, Proceedings of the Biennial Conference, Dundee, UK, 28 June–1 July 1977; Springer: Berlin/Heidelberg, Germany, 1978. [Google Scholar]
Branch, M.A.; Coleman, T.F.; Li, Y. A subspace, interior, and conjugate gradient method for large-scale bound-constrained minimization problems. SIAM J. Sci. Comput. 1999, 21, 1–23. [Google Scholar] [CrossRef]
Nash, S.G. A survey of truncated-Newton methods. J. Comput. Appl. Math. 2000, 124, 45–59. [Google Scholar] [CrossRef]
Byrd, R.H.; Lu, P.; Nocedal, J.; Zhu, C. A limited memory algorithm for bound constrained optimization. SIAM J. Sci. Comput. 1995, 16, 1190–1208. [Google Scholar] [CrossRef]
Powell, M.J. An efficient method for finding the minimum of a function of several variables without calculating derivatives. Comput. J. 1964, 7, 155–162. [Google Scholar] [CrossRef]
Storn, R.; Price, K. Differential evolution—A simple and efficient heuristic for global optimization over continuous spaces. J. Glob. Optim. 1997, 11, 341–359. [Google Scholar] [CrossRef]
Lasdon, L.; Duarte, A.; Glover, F.; Laguna, M.; Martí, R. Adaptive memory programming for constrained global optimization. Comput. Oper. Res. 2010, 37, 1500–1509. [Google Scholar] [CrossRef]
Olson, B.; Hashmi, I.; Molloy, K.; Shehu, A. Basin hopping as a general and versatile optimization framework for the characterization of biological macromolecules. Adv. Artif. Intell. 2012, 2012, 3. [Google Scholar] [CrossRef]
Bertsimas, D.; Tsitsiklis, J. Simulated annealing. Stat. Sci. 1993, 8, 10–15. [Google Scholar] [CrossRef]
Xiang, Y.; Sun, D.; Fan, W.; Gong, X. Generalized simulated annealing algorithm and its application to the Thomson model. Phys. Lett. A 1997, 233, 216–220. [Google Scholar] [CrossRef]
Hunter, E.; Mac Namee, B.; Kelleher, J. A Hybrid Agent-Based and Equation Based Model for the Spread of Infectious Diseases. J. Artif. Soc. Soc. Simul. 2020, 23, 14. [Google Scholar] [CrossRef]
Eubank, S.; Guclu, H.; Kumar, V.S.A.; Marathe, M.V.; Srinivasan, A.; Toroczkai, Z.; Wang, N. Modelling disease outbreaks in realistic urban social networks. Nature 2004, 429, 180–184. [Google Scholar] [CrossRef]
Venkatramanan, S.; Lewis, B.; Chen, J.; Higdon, D.; Vullikanti, A.; Marathe, M. Using data-driven agent-based models for forecasting emerging infectious diseases. Epidemics 2018, 22, 43–49. [Google Scholar] [CrossRef]
Muntoni, A.P.; Mazza, F.; Braunstein, A.; Catania, G.; Dall’Asta, L. Effectiveness of probabilistic contact tracing in epidemic containment: The role of superspreaders and transmission path reconstruction. PNAS Nexus 2024, 3, 377. [Google Scholar] [CrossRef]
Kim, J.; Bidokhti, S.S.; Sarkar, S. Capturing COVID-19 spread and interplay with multi-hop contact tracing intervention. PLoS ONE 2023, 18, e0288394. [Google Scholar] [CrossRef]
Tuschhoff, B.M.; Kennedy, D.A. Detecting and quantifying heterogeneity in susceptibility using contact tracing data. PLoS Comput. Biol. 2024, 20, e1012310. [Google Scholar] [CrossRef]
Hens, N.; Goeyvaerts, N.; Aerts, M.; Shkedy, Z.; Van Damme, P.; Beutels, P. Mining social mixing patterns for infectious disease models based on a two-day population survey in Belgium. BMC Infect. Dis. 2009, 9, 5. [Google Scholar] [CrossRef]
Veneti, L.; Robberstad, B.; Steens, A.; Forland, F.; Winje, B.A.; Vestrheim, D.F.; Jarvis, C.I.; Gimma, A.; Edmunds, W.J.; Van Zandvoort, K.; et al. Social contact patterns during the early COVID-19 pandemic in Norway: Insights from a panel study, April to September 2020. BMC Public Health 2024, 24, 1438. [Google Scholar] [CrossRef]
Mistry, D.; Litvinova, M.; Pastore Y Piontti, A.; Chinazzi, M.; Fumanelli, L.; Gomes, M.F.; Haque, S.A.; Liu, Q.H.; Mu, K.; Xiong, X.; et al. Inferring high-resolution human mixing patterns for disease modeling. Nat. Commun. 2021, 12, 323. [Google Scholar] [CrossRef] [PubMed]
Boyer, C.B.; Rumpler, E.; Kissler, S.M.; Lipsitch, M. Infectious disease dynamics and restrictions on social gathering size. Epidemics 2022, 40, 100620. [Google Scholar] [CrossRef] [PubMed]
Munday, J.D.; Abbott, S.; Meakin, S.; Funk, S. Evaluating the use of social contact data to produce age-specific short-term forecasts of SARS-CoV-2 incidence in England. PLoS Comput. Biol. 2023, 19, e1011453. [Google Scholar] [CrossRef] [PubMed]
Franco, N.; Coletti, P.; Willem, L.; Angeli, L.; Lajot, A.; Abrams, S.; Beutels, P.; Faes, C.; Hens, N. Inferring age-specific differences in susceptibility to and infectiousness upon SARS-CoV-2 infection based on Belgian social contact data. PLoS Comput. Biol. 2022, 18, e1009965. [Google Scholar] [CrossRef]
Wu, Y.; Kang, L.; Guo, Z.; Liu, J.; Liu, M.; Liang, W. Incubation period of COVID-19 caused by unique SARS-CoV-2 strains: A systematic review and meta-analysis. JAMA Netw. Open 2022, 5, e2228008. [Google Scholar] [CrossRef]
Hoang, T.; Coletti, P.; Melegaro, A.; Wallinga, J.; Grijalva, C.; Edmunds, W.J.; Beutels, P.; Hens, N. Social Contact Data Repository, 2019. A Systematic Review of Social Contact Surveys to Inform Transmission Models of Close-contact Infections. Epidemiology 2019, 30, 723–736. Available online: https://www.socialcontactdata.org (accessed on 25 November 2024). [CrossRef]
Johns Hopkins University Center for Systems Science and Engineering (CSSE). COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University. 2020. Available online: https://github.com/CSSEGISandData/COVID-19 (accessed on 25 November 2024).
Klepac, P.; Kissler, S.; Gog, J. Contagion! The BBC four pandemic—The model behind the documentary. Epidemics 2018, 24, 49–59. [Google Scholar] [CrossRef]

Figure 1. SEIRS (Susceptible, Exposed, Infected, Recovered, Susceptible) Model Diagram: Susceptible means healthy and asymptomatic but not immune; Exposed means contagious but asymptomatic; Infected means symptomatic but assumed isolated so not contagious; and Recovered means healthy and temporarily immune. Differential equation models are Markovian in that the number of individuals in each state at time

t + 1

depends on the parameter values (

β, σ, γ, η

) and the number of individuals in each state at time t, ignoring previous times.

Figure 1. SEIRS (Susceptible, Exposed, Infected, Recovered, Susceptible) Model Diagram: Susceptible means healthy and asymptomatic but not immune; Exposed means contagious but asymptomatic; Infected means symptomatic but assumed isolated so not contagious; and Recovered means healthy and temporarily immune. Differential equation models are Markovian in that the number of individuals in each state at time

t + 1

depends on the parameter values (

β, σ, γ, η

) and the number of individuals in each state at time t, ignoring previous times.

Figure 2. Using simulated data from the simulation engine EpiPolicy, these figures show the predictions from the two best methods ARIMA and EpiInfer. EpiInfer has a lower mean relative RMSE than ARIMA for 10- and 20-day predictions, while showing similar performance for 5-day predictions. Please see Table 1 for p-values and the confidence interval for each forecast length. (a) One-day predictions with 10–50 training days. (b) Five-day predictions with 10–50 training days. (c) Ten-day predictions with 10–50 training days. (d) Twenty-day predictions with 10–50 training days.

Figure 3. Using simulated data from sister project EpiPolicy, these are the relative RMSE (Root Mean Squared Error) data from the two best methods ARIMA and EpiInfer. EpiInfer has a lower mean relative RMSE than ARIMA for 10 and 20-day predictions, while showing similar performance for 5-day predictions. Please see Table 1 for p-values and confidence intervals. (a) One-day predictions with 10–50 training days. The mean RMSE is 0.042 for EpiInfer and 0.011 for ARIMA. (b) Five-day predictions with 10–50 training days. The mean RMSE is 0.165 for EpiInfer and 0.13 for ARIMA. (c) Ten-day predictions with 10–50 training days. The mean RMSE is 0.227 for EpiInfer and 0.29 for ARIMA. (d) Twenty-day predictions with 10–50 training days. The mean RMSE is 0.28 for EpiInfer and 0.55 for ARIMA.

Figure 4. Visual comparison for CoMix/JHU data predictions for 1–20 days into the future. ARIMA and EpiInfer are better than (lower Root Mean Squared Error than) LSTM and Differential Equation models regardless of the forecast horizon. ARIMA beats EpiInfer for short-term predictions of five days or less. EpiInfer beats all other methods beyond five days.

Figure 5. Using CoMix/JHU data, these are the predictions from the two best methods ARIMA and EpiInfer. EpiInfer has a lower mean relative RMSE than ARIMA for 10- and 20-day predictions, while showing similar performance for 5-day predictions. Please see Table 2 for p-values and confidence intervals. (a) One-day predictions with 10–50 training days. (b) Five-day predictions with 10–50 training days. (c) Ten-day predictions with 10–50 training days. (d) Twenty-day predictions with 10–50 training days.

Figure 6. Using CoMix/JHU data, these are the relative RMSE (Root Mean Squared Error) data from the two best methods ARIMA and EpiInfer. EpiInfer has a lower mean relative RMSE than ARIMA for 10- and 20-day predictions, while showing similar performance for 5-day predictions. Please see Table 2 for p-values. (a) One-day predictions with 10–50 training days. The mean RMSE is 0.089 for EpiInfer and 0.04 for ARIMA. (b) Five-day predictions with 10–50 training days. The mean RMSE is 0.22 for EpiInfer and 0.22 for ARIMA. (c) Ten-day predictions with 10–50 training days. The mean RMSE is 0.266 for EpiInfer and 0.46 for ARIMA. (d) Twenty-day predictions with 10–50 training days. The mean RMSE is 0.335 for EpiInfer and 1.29 for ARIMA.

Figure 7. EpiPolicy data experiment predictions of EpiInfer with full meeting distribution data compared with EpiInfer using the mean number of meetings only. Every parameter except

β

is the same as described in Section 4.2. The values of

β

for the three groups are 0.5, 0.9, and 0.15. Using the mean, for this set of

β

values, yields results that are nearly as good as using the full distribution. (a) One-day predictions with 10–50 training days. (b) Five-day predictions with 10–50 training days. (c) Ten-day predictions with 10–50 training days. (d) Twenty-day predictions with 10–50 training days.

Figure 7. EpiPolicy data experiment predictions of EpiInfer with full meeting distribution data compared with EpiInfer using the mean number of meetings only. Every parameter except

β

is the same as described in Section 4.2. The values of

β

for the three groups are 0.5, 0.9, and 0.15. Using the mean, for this set of

β

values, yields results that are nearly as good as using the full distribution. (a) One-day predictions with 10–50 training days. (b) Five-day predictions with 10–50 training days. (c) Ten-day predictions with 10–50 training days. (d) Twenty-day predictions with 10–50 training days.

Figure 8. Second EpiPolicy data experiment predictions of EpiInfer and EpiInfer with average number of meeting only. Every parameter except

β

is the same as described in Section 4.2. The values of

β

for the three groups are 0, 1.5, and 0. EpiInfer with the full data distribution performs better than the average only in most cases. (a) One-day predictions with 10–50 training days. (b) Five-day predictions with 10–50 training days. (c) Ten-day predictions with 10–50 training days. (d) Twenty-day predictions with 10–50 training days.

Figure 8. Second EpiPolicy data experiment predictions of EpiInfer and EpiInfer with average number of meeting only. Every parameter except

β

is the same as described in Section 4.2. The values of

β

for the three groups are 0, 1.5, and 0. EpiInfer with the full data distribution performs better than the average only in most cases. (a) One-day predictions with 10–50 training days. (b) Five-day predictions with 10–50 training days. (c) Ten-day predictions with 10–50 training days. (d) Twenty-day predictions with 10–50 training days.

Figure 9. Gamma meeting distributions experiment 20-day predictions of EpiInfer with full meeting distribution data compared with EpiInfer using the mean number of meetings only. Every parameter except

β

is the same as described in Section 4.2. The values of

β

are (0.4, 0.55, 0.75), (0.2, 0.5, 1.05), and (0.05, 0.35, 1.25) respectively. As the variance of the number of meetings increases, EpiInfer with full meeting distribution data performs increasingly better compared to EpiInfer using the mean number of meetings only. (a) Twenty-day predictions with 10–50 training days for

α = 12

and

θ = 1

. (b) Twenty-day predictions with 10–50 training days for

α = 2

and

θ = 6

. (c) Twenty-day predictions with 10–50 training days for

α = 1

and

θ = 12

.

Figure 9. Gamma meeting distributions experiment 20-day predictions of EpiInfer with full meeting distribution data compared with EpiInfer using the mean number of meetings only. Every parameter except

β

is the same as described in Section 4.2. The values of

β

are (0.4, 0.55, 0.75), (0.2, 0.5, 1.05), and (0.05, 0.35, 1.25) respectively. As the variance of the number of meetings increases, EpiInfer with full meeting distribution data performs increasingly better compared to EpiInfer using the mean number of meetings only. (a) Twenty-day predictions with 10–50 training days for

α = 12

and

θ = 1

. (b) Twenty-day predictions with 10–50 training days for

α = 2

and

θ = 6

. (c) Twenty-day predictions with 10–50 training days for

α = 1

and

θ = 12

.

Figure 10. CoMix/JHU data experiment predictions of EpiInfer and EpiInfer with average number of meeting only. There is no significant difference between estimates when using the full distributions versus estimates when using the average only. (a) One-day predictions with 10–50 training days. (b) Five-day predictions with 10–50 training days. (c) Ten-day predictions with 10–50 training days. (d) Twenty-day predictions with 10–50 training days.

Figure 11. Using simulated data from EpiPolicy, these are the relative Root Mean Squared Error (see definition in the text) data from EpiInfer when using one, similar (in meeting distribution) and different locales. EpiInfer using similar locales (EpiInfer-multi) shows the best overall performance. Please see Table 4 for p-values of the comparison between one and similar locales.

Figure 12. Visual comparison for EpiPolicy data predictions from 1–20 days into the future. EpiInfer-multi is consistently more accurate than EpiInfer on a single locale and is significantly more accurate than ARIMA starting with forecasting horizons of four days.

Figure 13. Using simulated data from EpiPolicy, these are predictions from the two best methods ARIMA and EpiInfer-multi using locales having similar meeting distributions. Please see Table 4 for p-values and confidence intervals. (a) One-day predictions with 10–50 training days. (b) Five-day predictions with 10–50 training days. (c) Ten-day predictions with 10–50 training days. (d) Twenty-day predictions with 10–50 training days.

Figure 14. Using simulated data from EpiPolicy, these are the relative Root Mean Squared Error data from the two best methods ARIMA and EpiInfer-multi with similar locales. EpiInfer-multi with similar locales has a lower mean relative RMSE than ARIMA for 5, 10, and 20-day predictions, while showing worse performance for 1-day predictions. Please see Table 4 for p-values and confidence intervals of the difference in RMSE of the best different forecasting methods. (a) One-day predictions with 10–50 training days. The mean RMSE is 0.0273 for EpiInfer-multi with similar locales and 0.011 for ARIMA. (b) Five-day predictions with 10–50 training days. The mean RMSE is 0.0639 for EpiInfer with similar locales and 0.13 for ARIMA. (c) Ten-day predictions with 10–50 training days. The mean RMSE is 0.1261 for EpiInfer-multi with similar locales and 0.29 for ARIMA. (d) Twenty-day predictions with 10–50 training days. The mean RMSE is 0.2216 for EpiInfer-multi with similar locales and 0.55 for ARIMA.

Table 1. Root Mean Squared Error (RMSE) results for EpiPolicy data include predictions from 1–20 days into the future. For a prediction value of f in the future, the methods use the data from day d and possibly previous days to predict day

d + f

. The results show that EpiInfer and ARIMA are the best two methods, with ARIMA better for short-term predictions (up to 7 days) and EpiInfer better afterward. The p-values of the difference are under 0.05 for up to 3 days in the case when ARIMA is better and starting at predictions 12 days in the future when EpiInfer is better. When the p-value is below 0.05, we show the confidence interval of the difference in the mean. Otherwise, we do not report the confidence interval (value NA).

Table 1. Root Mean Squared Error (RMSE) results for EpiPolicy data include predictions from 1–20 days into the future. For a prediction value of f in the future, the methods use the data from day d and possibly previous days to predict day

d + f

. The results show that EpiInfer and ARIMA are the best two methods, with ARIMA better for short-term predictions (up to 7 days) and EpiInfer better afterward. The p-values of the difference are under 0.05 for up to 3 days in the case when ARIMA is better and starting at predictions 12 days in the future when EpiInfer is better. When the p-value is below 0.05, we show the confidence interval of the difference in the mean. Otherwise, we do not report the confidence interval (value NA).

Days Predicted	DiffEq	LSTM	ARIMA	EpiInfer	p-Value EpiInfer vs. ARIMA	Confidence Interval
1	0.33277	0.09573	0.01070	0.04196	< $10^{- 5}$	(−0.04..−0.02)
2	0.63757	0.18064	0.03437	0.06985	0.0033	(−0.06..−0.02)
3	1.09561	0.24920	0.06351	0.10496	0.017	(−0.07..−0.01)
4	1.66645	0.30655	0.09570	0.13063	0.0906	NA
5	2.20159	0.35327	0.12914	0.16496	0.1191	NA
6	2.50284	0.39124	0.16296	0.19230	0.2077	NA
7	2.50845	0.42264	0.19636	0.21148	0.3473	NA
8	2.34101	0.44845	0.22872	0.21356	0.3569	NA
9	2.12217	0.46871	0.26021	0.21829	0.1659	NA
10	1.91056	0.48642	0.29065	0.22700	0.081	NA
11	1.73462	0.50015	0.31965	0.23986	0.0517	NA
12	1.57637	0.51113	0.34811	0.25326	0.0286	(0.01..0.18)
13	1.43706	0.52018	0.37555	0.26602	0.0174	(0.03..0.19)
14	1.31618	0.52729	0.40174	0.26977	0.0075	(0.04..0.22)
15	1.21238	0.53306	0.42775	0.28016	0.0038	(0.06..0.24)
16	1.12391	0.53706	0.45325	0.28652	0.0015	(0.08..0.26)
17	1.04897	0.54045	0.47779	0.27885	0.0007	(0.11..0.29)
18	0.98581	0.54295	0.50275	0.27620	0.0003	(0.13, 0.32)
19	0.93300	0.54480	0.52762	0.27601	< $10^{- 5}$	(0.16..0.35)
20	0.88881	0.54562	0.55176	0.28160	< $10^{- 5}$	(0.17..0.37)

Table 2. Root Mean Squared Error (RMSE) results for CoMix/JHU data predictions from 1–20 days into the future. For a prediction value of f in the future, the methods use the data from day d and possibly previous days to predict day d + f. The results show that EpiInfer and ARIMA are the best two methods, with ARIMA better for short-term predictions (5 days or fewer) and EpiInfer better afterwards. The p-values of the difference are under 0.05 for up to 3 days in the case when ARIMA is better and starting at predictions 8 days in the future when EpiInfer is better. Figure 4 shows the RMSE of ARIMA vs. EpiInfer for each forecasting value between 1 and 20. As for the simulated data, we show confidence intervals of the difference in the RMSE only when the p-value is under 0.05.

Days Predicted	DiffEq	LSTM	ARIMA	EpiInfer	p-Value EpiInfer vs. ARIMA	Confidence Interval
1	0.76267	0.10798	0.03950	0.0892	< $10^{- 5}$	(−0.07..−0.03)
2	1.18928	0.19909	0.07349	0.1501	0.0001	(−0.11..−0.05)
3	1.74547	0.27690	0.11782	0.188	0.0028	(−0.11..−0.03)
4	2.46983	0.34464	0.16704	0.2064	0.1145	NA
5	3.43477	0.40345	0.21813	0.2233	0.4402	NA
6	4.73146	0.45760	0.26570	0.243	0.3	NA
7	6.46551	0.50773	0.31143	0.2632	0.1364	NA
8	8.80375	0.55734	0.35559	0.2673	0.0335	(0.01..0.17)
9	11.96590	0.60656	0.40478	0.2681	0.0048	(0.05..0.22)
10	16.20543	0.65536	0.45727	0.266	0.0009	(0.10..0.29)
11	21.91845	0.70905	0.51471	0.2616	< $10^{- 5}$	(0.15..0.36)
12	29.63236	0.76246	0.57522	0.2576	< $10^{- 5}$	(0.21..0.43)
13	40.11380	0.82061	0.63871	0.2567	< $10^{- 5}$	(0.26..0.51)
14	54.31285	0.88236	0.70907	0.2605	< $10^{- 5}$	(0.32..0.58)
15	73.53797	0.94588	0.78389	0.2675	< $10^{- 5}$	(0.38..0.66)
16	99.57146	1.01677	0.86681	0.2786	< $10^{- 5}$	(0.43..0.74)
17	134.65453	1.09481	0.95776	0.2922	< $10^{- 5}$	(0.50..0.85)
18	181.84841	1.18354	1.05928	0.3066	< $10^{- 5}$	(0.56..0.96)
19	244.97248	1.27533	1.17193	0.3203	< $10^{- 5}$	(0.64..1.08)
20	328.45723	1.37127	1.29173	0.3351	< $10^{- 5}$	(0.72..1.21)

Table 3. Root Mean Squared Error (RMSE) results for flu data predictions from 7, 14, and 21 days into the future. The results show that EpiInfer is the best method throughout, followed by ARIMA. As the rightmost column shows, the p-value of the difference between EpiInfer and ARIMA is quite high at 7 days and below 0.05 only at 21 days.

Days Predicted	DiffEq	LSTM	GNN	Transformer	ARIMA	EpiInfer	p-Value
7	34.52	0.154	0.195	0.162	0.062	0.059	0.3507
14	269.84	0.302	0.318	0.312	0.198	0.165	0.0815
21	627.01	0.455	0.441	0.470	0.423	0.338	0.0495

Table 4. p-Value of comparison ofEpiInfer-multi vs single localeEpiInfer. Root Mean Squared Error (RMSE) results for EpiPolicy data predictions from 1–20 days into the future. For a prediction value of f days in the future, the methods use the data from day d and possibly previous days to predict day

d + f

. The results show that EpiInfer-multi (over multiple locales having a similar meeting distribution) and ARIMA are the best two methods, with ARIMA better for 1-day predictions and EpiInfer-multi is about equal to ARIMA for two days and then better for all longer forecasting horizons. The p-values of the difference are under 0.05 for up to 1 day in the case when ARIMA is better and starting at predictions 4 days in the future when EpiInfer-multi has a lower RMSE.

Table 4. p-Value of comparison ofEpiInfer-multi vs single localeEpiInfer. Root Mean Squared Error (RMSE) results for EpiPolicy data predictions from 1–20 days into the future. For a prediction value of f days in the future, the methods use the data from day d and possibly previous days to predict day

d + f

. The results show that EpiInfer-multi (over multiple locales having a similar meeting distribution) and ARIMA are the best two methods, with ARIMA better for 1-day predictions and EpiInfer-multi is about equal to ARIMA for two days and then better for all longer forecasting horizons. The p-values of the difference are under 0.05 for up to 1 day in the case when ARIMA is better and starting at predictions 4 days in the future when EpiInfer-multi has a lower RMSE.

Days Predicted	DiffEq	LSTM	ARIMA	EpiInfer	EpiInfer Multi-Locale	p-Value EpiInfer-Multi vs. ARIMA	p-Value EpiInfer-Multi vs. EpiInfer
1	0.33277	0.09573	0.01070	0.04196	0.0273	0.0018 (−0.03..−0.01)	0.029 (0..0.3)
2	0.63757	0.18064	0.03437	0.06985	0.0373	0.3675	0.0011 (0.01..0.05)
3	1.09561	0.24920	0.06351	0.10496	0.0463	0.1057	< $10^{- 5}$ (0.03..0.08)
4	1.66645	0.30655	0.09570	0.13063	0.0555	0.0062 (0.01..0.07)	0.0004 (0.04..0.11)
5	2.20159	0.35327	0.12914	0.16496	0.0639	0.0006 (0.03..0.10)	< $10^{- 5}$ (0.06..0.14)
6	2.50284	0.39124	0.16296	0.19230	0.0766	< $10^{- 5}$ (0.05..0.13)	< $10^{- 5}$ (0.07..0.16)
7	2.50845	0.42264	0.19636	0.21148	0.0912	< $10^{- 5}$ (0.06..0.15)	< $10^{- 5}$ (0.07..0.17)
8	2.34101	0.44845	0.22872	0.21356	0.1026	0.0001 (0.08..0.17)	0.0003 (0.06..0.16)
9	2.12217	0.46871	0.26021	0.21829	0.1143	< $10^{- 5}$ (0.09..0.2)	0.0008 (0.05..0.16)
10	1.91056	0.48642	0.29065	0.22700	0.1261	< $10^{- 5}$ (0.11..0.22)	0.0038 (0.05..0.16)
11	1.73462	0.50015	0.31965	0.23986	0.1383	< $10^{- 5}$ (0.12..0.24)	0.0033 (0.04..0.16)
12	1.57637	0.51113	0.34811	0.25326	0.1509	< $10^{- 5}$ (0.14..0.26)	0.004 (0.04..0.17)
13	1.43706	0.52018	0.37555	0.26602	0.1651	< $10^{- 5}$ (0.15..0.28)	0.0055 (0.04..0.17)
14	1.31618	0.52729	0.40174	0.26977	0.1761	< $10^{- 5}$ (0.16..0.29)	0.0159 (0.03..0.16)
15	1.21238	0.53306	0.42775	0.28016	0.1839	< $10^{- 5}$ (0.17..0.31)	0.0159 (0.03..0.17)
16	1.12391	0.53706	0.45325	0.28652	0.1978	< $10^{- 5}$ (0.18..0.33)	0.0219 (0.02..0.16)
17	1.04897	0.54045	0.47779	0.27885	0.2054	< $10^{- 5}$ (0.2..0.35)	0.0485 (0.00..0.14)
18	0.98581	0.54295	0.50275	0.27620	0.2121	< $10^{- 5}$ (0.21..0.37)	0.0746
19	0.93300	0.54480	0.52762	0.27601	0.2192	< $10^{- 5}$ (0.23..0.39)	0.1037
20	0.88881	0.54562	0.55176	0.28160	0.2216	< $10^{- 5}$ (0.24..0.42)	0.0937

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Kascelan, J.; Yang, R.; Shasha, D. EpiInfer: A Non-Markovian Method and System to Forecast Infection Rates in Epidemics. Algorithms 2025, 18, 450. https://doi.org/10.3390/a18070450

AMA Style

Kascelan J, Yang R, Shasha D. EpiInfer: A Non-Markovian Method and System to Forecast Infection Rates in Epidemics. Algorithms. 2025; 18(7):450. https://doi.org/10.3390/a18070450

Chicago/Turabian Style

Kascelan, Jovan, Ruoxi Yang, and Dennis Shasha. 2025. "EpiInfer: A Non-Markovian Method and System to Forecast Infection Rates in Epidemics" Algorithms 18, no. 7: 450. https://doi.org/10.3390/a18070450

APA Style

Kascelan, J., Yang, R., & Shasha, D. (2025). EpiInfer: A Non-Markovian Method and System to Forecast Infection Rates in Epidemics. Algorithms, 18(7), 450. https://doi.org/10.3390/a18070450

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

EpiInfer: A Non-Markovian Method and System to Forecast Infection Rates in Epidemics

Abstract

1. Introduction

2. Related Work

2.1. Calibration

2.1.1. Gradient-Based

2.1.2. Gradient-Free

2.2. Models That Use Information About Meetings

2.2.1. Agent-Based Models

2.2.2. Contact Tracing-Based Models

2.3. Meeting Distribution-Based Models

3. Materials and Methods

3.1. EpiInfer-Core: Estimating Infection Rates from Basic Parameters

3.2. ContinuousCalibrate: Systematic Search for p 1 and p 2

3.3. Comparison with the Differential Equation Based Model

4. Single Locale Experiments

4.1. Other Methods

4.2. Method Comparison on Simulation Data

5. Experiments with CoMix/JHU Data on COVID-19

6. Experiments with BBC Data on the Flu

7. Is Infection Prediction Sensitive to the Precise Values of Transition Parameters

8. Using Average Number of Meetings Instead of Meeting Distribution

8.1. EpiPolicy Data Regular Meeting Distribution Experiment

8.2. EpiPolicy Data Bimodal Meeting Distribution Scenario

8.3. EpiPolicy Data Gamma Meeting Distribution Experiment

8.4. CoMix/JHU Mean-of-Meetings Experiment

9. Multiple Locales

9.1. Multilocale Inference

9.2. Multilocale Experiment with EpiPolicy Data

10. Potential Deployment Scenarios

Scalability and Large-Scale/Real-Time Applications

11. Conclusions and Future Work

Future Work

Author Contributions

Funding

Institutional Review Board Statement

Data Availability Statement

Acknowledgments

Conflicts of Interest

Appendix A. Comparison with Transformer and Graph Neural Net on Real and Simulated Data

Appendix B. Sensitivity of Infection Prediction to Parameter Estimation

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

3.2. ContinuousCalibrate: Systematic Search for $p_{1}$ and $p_{2}$