Non-parametric model-based estimation of the effective reproduction number for SARS-CoV-2

Viral outbreaks, such as the current COVID-19 pandemic, are commonly described by compartmental models by means of ordinary differential equation (ODE) systems. The parameter values of these ODE models are typically unknown and need to be estimated based on accessible data. In order to describe realistic pandemic scenarios with strongly varying situations, these model parameters need to be assumed as time-dependent. While parameter estimation for the typical case of time-constant parameters does not pose larger issues, the determination of time-dependent parameters, e.g.~the transition rates of compartmental models, remains notoriously difficult, in particular since the function class of these time-dependent parameters is unknown. In this work, we present a novel method which utilizes the Augmented Kalman Smoother in combination with an Expectation-Maximization algorithm to simultaneously estimate all time-dependent parameters in an SIRD compartmental model. This approach only requires incidence data, but no prior knowledge on model parameters or any further assumptions on the function class of the time-dependencies. In contrast to other approaches for the estimation of the time-dependent reproduction number, no assumptions on the parameterization of the serial interval distribution are required. With this method, we are able to adequately describe COVID-19 data in Germany and to give non-parametric model-based time course estimates for the effective reproduction number.

A common approach to capture the dynamics of a viral disease spread is to utilize the classical Susceptible-Infected-Recovered (SIR) compartmental model and to estimate its parameters from recorded data [Capaldi et al., 2012, Kermack and McKendrick, 1991, Chen et al., 2021].These model parameters are typically unknown and differ for every disease, depending on the pathogen and its contagiousness [Bani-Yaghoub et al., 2012, Tönsing et al., 2018].For many infectious diseases, extensions of the classical SIR structure have been proposed in order to yield a more appropriate description of the infection dynamics.Such adaptations commonly distinguish between deceased (D) and recovered (R) individuals or add an intermediate exposed compartment (E) in which individuals are infected but not yet infectious.Introducing even more compartments and interactions in such models might add detail to the description of the infection dynamics, however, parameter estimation in overly complex models is hampered, because of the limited amount of available data.Furthermore, such models typically suffer from parameter-identifiability issues due to over-parameterizations [Wieland et al., 2021].
Many SIR-like model structures have been proposed and investigated for the COVID-19 disease [Barbarossa et al., 2020,Massonis et al., 2020,Godio et al., 2020,Raimúndez et al., 2021].Comprehensive model structures aim to cover all relevant features of the SARS-CoV-2 pandemic, such as for example asymptomatic infections, undetected infections or different hospitalization states [Khailaie et al., 2021].Despite the general interest in each of the individual parameters in such detailed models, an essential measure for the transmission potential of a disease in all epidemic models is the effective reproduction number R t , which is given by the average number of secondary infections caused by an infectious individual.It may vary over time, yields information about the current status of an outbreak, and can for example indicate the turning point of an exponential growth of infections.Its estimation from data can be achieved via epidemic models with time-dependent parameters and has been a key task for scientists to inform e.g.governmental decision makers during the SARS-CoV-2 pandemic.
In many countries, time series of the SARS-CoV-2 infection numbers showed multiple epidemic waves [Rahmandad et al., 2021, Prodanov, 2021], similar to temporal changes in the infection numbers as e.g. for seasonal influenza [Yaari et al., 2013].To capture these dynamics of the recorded infection case numbers in an SIR-like model, the model parameters need to be time-dependent [Jo et al., 2020, Hong and Li, 2020, Kolokolnikov and Iron, 2021].These time-dependencies of e.g.infection rate parameters not only cover seasonal effects [Capistrán et al., 2009, Merow andUrban, 2020], but also behavioral changes in the population and the impact of other non-pharmaceutical interventions.The latter might be modeled straightforwardly by a specific step-function which lowers the infection-rate parameter e.g. on the date when governmental interventions come into force [Dehning et al., 2020].Such models enable the comparison of non-pharmaceutical intervention strategies, but highly depend on the chosen class of time-dependent functions and are less suitable to cover behavioral adaptions in the population.Choosing the appropriate function classes for these time-dependent model parameters and inferring them from recorded data is closely related to what is generally called input estimation for ordinary differential equation (ODE) models, a task that still needs to be solved in particular for non-linear high-dimensional systems [Kaschek and Timmer, 2012, Engelhardt et al., 2016,Villaverde et al., 2019,Kreutz, 2020].In contrast, in the context of infectious disease models, a recently published approach does not require such a particular function class definition, but instead is able to estimate the time dependency for one single model parameter by means of a Kalman Smoother method [Arroyo-Marioli et al., 2021].
The here presented work extends the function class-free approach for time-dependencies of all dynamical parameters in an SIRD model.The method consists of an Augmented Kalman Smoother (AKS) in combination with an Expectation-Maximization (EM) algorithm for the estimation of multiple time-dependent model parameters solely from incidence data.In particular, our approach does not require any additional prior knowledge about the remaining model parameters or total and initial numbers in the individual population compartments.We apply the algorithm to the SARS-CoV-2 incidence data from winter 2020 to summer 2021 in Germany and are able to estimate the time-dependencies of all model parameters as well as reporting online and real-time estimates of the time-dependent effective reproduction number R t .

SIRD model and reparameterization
The spread of an infectious disease within a population is frequently described by the SIR model [Kermack and McKendrick, 1991] consisting of the three compartments: Susceptible S, infected I and removed R. The third compartment is split up into recovered R and deceased D resulting in the SIRD model [Keeling and Rohani, 2011] whose time-continuous evolution is given by the following system of ODEs: (1) The SIRD model comprises three time-constant transition rate parameters β, γ and θ which correspond to the infection rate, recovery rate and mortality rate, respectively.The initial values of the four compartments at t = 0 are denoted by S(0) = S 0 , I(0) = I 0 , while R(0) = 0 and D(0) = 0 so that the total number of individuals is N 0 = S 0 + I 0 .At the beginning of an epidemic, i.e. when S 0 ≈ N 0 , the basic reproduction number R 0 of the disease, corresponding to the average number of secondary cases, can be calculated directly from the model parameters as R 0 = β/(γ + θ).During the course of an epidemic, seasonal effects for the viral transmission and behavioral changes in the population, for example increased hygiene standards or other non-pharmaceutical interventions, are reflected by time-dependencies in the infection rate β.Such effects may also result in epidemic scenarios with multiple waves of infections, as observed e.g. for seasonal influenza or COVID-19.In order to account for temporal changes in the model parameters, the transition rates β t , γ t and θ t are assumed to vary over time.This further allows for the definition of a time-dependent effective reproduction number [Nishiura and Chowell, 2009] which depends on the change of susceptibles S(t) over time and on the time courses of the three parameters β t , γ t and θ t .Since for R t > 1 or R t < 1 the number of infected persons is increasing, or decreasing, respectively, the effective reproduction number is a characteristic measure for the current status of an epidemic.To simplify the system for the estimation of R t , we substitute Equation (2) into Equation (1), yielding the reparameterized form of the SIRD model: (3) Note that by means of this transformation, the compartment of susceptibles S is no longer appearing on the right-hand-side of Equation ( 3), therefor the differential equation for S could be neglected without affecting the solution of the remaining equations.
Parameter estimation in non-linear ODE models with time-constant parameters is typically tackled by efficient numerical optimization algorithms that minimize the discrepancy between recorded data and model trajectories by tuning the parameter values, e.g. by maximumlikelihood estimation [Raue et al., 2013, Raue et al., 2015, Stapor et al., 2018, Loos et al., 2018, Kaschek et al., 2019].In the presented setting of changing transmission rates during an infectious disease outbreak, also the time-dependency of the model parameters itself is unknown and needs to be estimated from data.This task is known as input estimation for ODE models in general.However, most methods are limited in their restriction to a certain function class for these time-dependent model parameters, e.g.step-functions, splines or combinations of sustained and transient response functions [Kreutz, 2020, Schelker et al., 2012].In contrast, we present a function class-free approach based on Kalman Filter and Smoother methods that is able to estimate the dynamics of unobserved model states and after some extensions also the time-dependencies of the parameters.

Estimation of time-dependent parameters in non-linear ODE models via the Augmented Kalman Smoother
The Kalman Filter is an iterative algorithm that is used to estimate time courses of unobserved states of a linear model from noisy observations over time.The corresponding Kalman Smoother is a two-step algorithm that uses the Kalman Filter as forward-pass followed by a backwards-pass procedure of the same kind.The step-wise linearizations of the Kalman Filter and Smoother for non-linear models is covered by the so-called Extended Kalman Filter and Smoother.Ultimately, the Augmented Kalman Smoother (AKS) combines the latter with the ability to consider timedependent parameters.

State-space model
The Kalman Filter and its extensions operate in the time-discrete state-space representation of a given model.It has the general form for discrete time points k of the n-dimensional process state vector x k and the d-dimensional vector of observables y k , with n > d.The transition of the state vector x from time point k to k+1 is given by the transition function b with assumed additive Gaussian normally distributed process noise k ∼ N (0, Q) with covariance matrix Q.Since not all states are observed directly, the observation matrix H defines the mapping between the state vector x and the observable vector y.Likewise to the process states, additive Gaussian distributed measurement noise η k ∼ N (0, R) with covariance matrix R is assumed for the observations y k .
The model structure of a given ODE model ẋ = f (x, p) with m parameters p can be transferred into its time-discrete state-space formulation, where the solutions of the ODE correspond to the states x of the state-space and the right-hand side f of the ODE is translated into timediscrete update rules for the transition function b.Furthermore, the observation matrix H needs to be specified in order to define the link from the data in vector y to the respective states from vector x.

Augmented Kalman Smoother
For the estimation of unobserved process states in Equation (4) in non-linear models, the Extended Kalman Smoother is used [Kalman, 1960].It assumes time-constant parameters and requires all model parameter values as input.In order to estimate time-dependent parameters, we utilize an extension of the Extended Kalman Filter and Smoother algorithms and call it Augmented Kalman Smoother (AKS) in correspondence to the previously reported Augmented Kalman Filter [Carrassi andVannitsem, 2011, Sun et al., 2008].It essentially follows the formulation of the Extended Kalman Smoother (EKS) with the distinction of introducing additional process states for each time-dependent model parameter.It therefore only requires appropriate initial values of the states and parameters as input that are subject to change over time.As a consequence, the AKS not only outputs the estimates for the time courses of the unobserved process states, but also yields estimates for the time courses of the time-dependent model parameters.Moreover, the AKS method estimates the covariance matrices P of the state-space vector x of the analyzed process in the same iteration and is initialized at time point k = 1.The process state vector x 0 and the process state covariance matrix P 0 at time point k = 0 need to be provided.
The first step of the algorithm is the so-called prediction step Here x k|k−1 and P k|k−1 correspond to the predicted state vector and covariance matrix at time k based on the k − 1 previous data points.Thus, the first predicted state vector x 1|0 is calculated from the transition function b applied to the state vector x 0 .As part of the step-wise linearization, the Jacobian of b at point x k−1|k−1 is calculated as The second step of the algorithm is the update step, where the predicted state vector and covariance matrix are updated using the noisy measurement y k : The Kalman gain K k is calculated from the the measurement covariance matrix R and the predicted covariance matrix P k|k−1 of the process.Since it can be interpreted as the relative weight between the model predictions and the noisy measurements, a large Kalman gain implies that more weight for the next prediction is placed on the measurements.In such a case, the uncertainties in the model predictions are considered to be larger than the measurement uncertainties and vice-versa.The predictions are in turn updated according to Equation ( 9) and (10).
To close the cycle of the algorithm, the updated state vector x k|k and covariance matrix P k|k are used as input for the following prediction step.This cycle is repeated until the end of the observed time series is reached.Equations ( 5)-(10) define the filter, which performs a forward estimation of state-space vector x and its covariance matrix P .The shortcoming of the filtering method is that the estimate x k|k is only based on the k − 1 previous data points.In consequence, the filter's estimates at the beginning of the time series are based on less information than those at the end.Furthermore, the estimates always lag one step behind the data.To circumvent this issue, the filter is typically combined with a smoother, which is commonly also called Rauch-Tung-Striebel Smoother [Rauch et al., 1965].While the filter runs along the data from k = 1 to k = N , the smoother runs backwards from k = N − 1 to k = 0.It is initialized with the last updated estimates of the filter x N |N and P N |N .It does not directly use the noisy measurements y k but rather compares the estimates and predictions that result from the filter to generate smoothed estimates at each time point k which are based on all available data: Here, B k is the smoother gain which has a similar function to the Kalman gain K k in Equation (8).As a final result, the AKS algorithm outputs the state covariance matrices P k|N and the smoothed estimates for the state vector x k|N that correspond to the trajectories of the model states as well as to the time courses of the model parameters.

Expectation-Maximization Algorithm for AKS Initial Parameters
In a Kalman Smoother algorithm, appropriate initial values for x 0 , P 0 , Q and R need be provided.
In order to provide optimal starting conditions, an Expectation-Maximization (EM) algorithm can be applied.The expectation step (E-step) consists of the AKS algorithms.Based on the smoothed estimates x k|N and P k|N , maximum-likelihood estimation is performed for Q and R, wherefore update equations for these covariance matrices are obtained by maximizing the following target function [Dreano et al., 2017] with respect to Q and R: It can be analytically shown that the maximum is reached for serving as input for the next EM-iteration [Dreano et al., 2017].In addition, the values for x 0|N and P 0|N obtained from the AKS of the previous iteration are taken as initial values x 0 and P 0 of the next iteration corresponding to the maximization step (M-step) of the EM algorithm.
These steps are repeated until the convergence criterium reaches a certain level with α denoting the relative tolerance for the difference between successive estimates of the hyperparameters Q, R, x 0 and P 0 .Within the combined application with the AKS, an initial guess for these hyperparameters needs to be provided to the EM algorithm.In the presented analyses, the choice of these values is not critical for the final results.

From the time-continuous ODE system to its time-discrete state space formulation
Although within this paper, we apply the presented Kalman Smoother method only to epidemic models, it is generally applicable to partially observed non-linear ODE systems, as is shortly outlined in the following.Let us consider the following time-continuous dynamic ODE model Here, the time derivative of the state variables x is given as a function f depending on the states vector itself and the parameter vector p.The initial conditions of the ODE system can be interpreted as additional parameters in p.
For the time-discrete formalism, the ODE's right-hand-side f (x, p) is interpreted as an update equation for each discrete time step from x k−1 to x k , such that Equation ( 17) turns into The parameters p k are treated as additional states in the state space which are not updated according to the ODE model structure, reflected by the zero term in the equation.Instead, the time-dependence originates only from the additive noise term k−1 which is simultaneously estimated by the Kalman Smoother in each step.
It turns out that this additional degree of freedom enables our Kalman Smoother-based approach to adapt these parameters according to the available data.Thus, by construction it achieves a data-based estimation of time-dependent parameters in the ODE system without any restriction on the parameter trajectory.

SIRD-AKS method for estimating the effective reproduction number R t
Based on its general derivation in Equation ( 18), the time-discrete state-space formulation of the SIRD model from Equation (3) reads where the first four rows correspond to the four states of the time-continuous ODE, and k−1 is the process noise as introduced in Equation ( 4).The desired time-dependence of the three SIRD model parameters R k , γ k and θ k is introduced for the AKS method by treating the model parameters as additional states in the state-space formulation which is henceforth called SIRD-AKS.The estimates from the SIRD-AKS thus yield time courses for the model states S k , I k , R k and D k as well as for the parameters R k , γ k and θ k .
In a real-world scenario of an outbreak, there is typically no data available for the value of the currently infectious individuals in state I per day.Instead, the so-called incidence, i.e. the number of newly infected cases per time period, is better accessible and thus usually reported.In the SIRD model, these incidence numbers correspond to the influx in the infectious state to the respective fluxes.Thus, the fluxes v I,k , v R,k and v D,k serve as observables y k for the SIRD-AKS method.However, the observation matrix H in Equation ( 4) only allows for a linear relationship between observables and states.In order to be able to map the data to the appropriate variables of the state-space for the SIRD-AKS method, again the state-space formulation needs to be expanded by three additional states that correspond to the three fluxes, yielding and observation matrix that links the incidence data to the respective fluxes fluxes v I,k , v R,k and v D,k .Based on incidence data, the SIRD-AKS method is applied to the state-space formulation Equation (20) and yields estimates for the time courses of the state-space vector x k .Its entries are then interpreted as discretized time series of the model states, time-dependent parameters and fluxes, i.e. incidence numbers.Since all variables are strictly positive, all analyses are performed on logarithmic scale for x k , which is beneficial for numerical stability.

Incidence-based reproduction number calculation method
In contrast to the formulation of the effective reproduction number R t as a time-dependent parameter in the reparameterized SIRD model and its estimation using e.g. the SIRD-AKS method, the reproduction number can also be calculated directly from incidence data [Cori et al., 2013].In its original formulation in [Cori et al., 2013], this incidence-based method requires not only incidence data, but also the empirical serial interval distribution, i.e. information about the time period between illness onset in an infected case and illness onset in a consecutive case.Since reliable data on this was not available in the beginning of the SARS-Cov-2 pandemic, the German federal center for disease control and prevention, i.e. the Robert Koch Institute (RKI), chose to use the incidence-based method from [Cori et al., 2013] and assumed a Dirac δ-distribution at s = 4 days for the serial interval distribution [Hamouda et al., 2020].This assumption of one single applicable value for the serial interval time s leads to an easily interpretable calculation rule for the effective reproduction number as the ratio between the number of newly infected individuals n I at time point t and at time point t − s.To account for fluctuations in the reported incidence data such as week-day dependencies, a moving average over τ days is considered that leads to It should be noted, that two different R s,τ t -values were published on a daily basis by the RKI, both with the described assumption of the /delta-distributed serial interval at s = 4 days: (i) A so-called sensitive 4-day R 4,4 t value, and (ii) a more stable 7-day R 4,7 t , depending on the averaging window length.
The advantage of this incidence-based method is that it does not require the explicit formulation of a compartment model, its numerical evaluation or any estimation of parameters.Furthermore, a certain value of for example R t = 2 becomes easily interpretable and verifiable also for non-experts, since it corresponds to a doubling in the numbers of newly infected individuals within s = 4 days.As a consequence, the daily updated value of R t calculated from this incidence-method was used by the RKI to inform the public as well as political decision makers about the status of the pandemic situation and thus had a decent influence on the installed non-pharmaceutical interventions [Hamouda et al., 2020].
While being very simple to calculate and easy to interprete, the method has the disadvantage that the R t value and its interpretation strongly depends on the appropriateness of the chosen δ-distributed serial interval, as it is directly associated with the time period for the doubling times.In contrast, the presented SIRD-AKS method does not require any assumptions on the serial interval distribution.

Simulation setting
To show the performance of the presented AKS method for the estimation of time-dependent parameters in nonlinear ODE models, we first applied the algorithm to simulated data sets from various scenarios.In each simulation scenario, a different combination of model structure and values of the time-constant parameters or time courses of the time-dependent parameters were chosen.The ODE system was then solved for given initial conditions by using the R package deSolve [Soetaert et al., 2010].In order to generate realistic incidence data for the respective quantities, the fluxes v I , v R and v D of the ODE system were discretized and relative Gaussian noise with zero mean and given standard deviation was added.In the initial examples, different parameter time courses and noise levels were chosen in an SIRD-model for data generation, followed by scenarios that consider data sets generated by non-SIRD models.Ultimately, the method was applied to officially reported incidence data for SARS-CoV-2 in Germany.
Note that all noisy incidence data sets were analyzed by the presented SIRD-AKS method, i.e. based on the reparameterized SIRD model from Equation (20), regardless of the chosen model structure for data generation.Further, no information about true values, time-dependency or function class of the model parameters were provided to the SIRD-AKS algorithm, but only the respective incidence data sets.Then, the resulting estimates of the SIRD-AKS algorithm for model states, fluxes and parameter time-courses were compared to the respective true quantities derived from the solutions of the data generating ODE models.In particular, the accordance of the estimated time-dependent effective reproduction number R t and its true time course was analyzed.

AKS performance for multiple time-varying SIRD model parameters
As a first simulation setting, the classical SIRD model from Equation (1) with the initial values S 0 = 10 8 , I 0 = 1 was used for data generation.Furthermore, three different parameter settings for the three model parameters β t , θ t and γ t were simulated.In the first parameter setting shown in Figure 1A, time-varying parameters were chosen as β t = 0.05 + 0.04 • sin (t/50) and θ t = 0.003 + 0.0015 • sin (t/33), while γ = 0.02 was chosen to be constant over time.The resulting dynamics for R t reflect the combination of the sinusoidal input β t and the decrease of susceptibles S, c.f. Equation (2).To generate data sets, the fluxes v I = β t SI N 0 , v D = θ t I and v R = γ t I between the model's states were evaluated at 375 equidistant time points, mimicking approximately one year of daily observations.Gaussian noise with a relative error of 5% was added to the three time series of observables, shown as green dots in Figure 1.
Next, the noisy data was provided to the SIRD-AKS in order to estimate model states and time-dependent parameters.For this, the EM-algorithm part of the SIRD-AKS was initialized with 6) and the convergence threshold was chosen to be α = 0.001.The estimation results are shown as black lines in Figure 1A, where the grey error bands of the estimates correspond to the estimated covariance matrices P k|k .Note that the time-dependent parameters are shown on logarithmic scale throughout this work.The SIRD-AKS is able to simultaneously capture the dynamics of the simulated values for parameters, states and observations.In particular, the time-dependent parameters γ t , R t and θ t were obtained without providing any information on their function class or dynamical behavior.It can be observed that the grey error bands for the estimated parameters in Figure 1 are larger in the beginning of the time series, where also the parameter estimates slightly deviate from the true values due to missing dynamics in the data.In contrast, with increasing dynamic activity in the data, the error bands become smaller and the estimates coincide with the true values.
To analyze whether the SIRD-AKS can also handle different time-dependencies of the model parameters, for example rapidly changing parameters including discontinuous functions over time, two further parameter settings were addressed as shown in Figures 1B and C. For this, the SIRD model was evaluated with time-constant parameters γ t and θ t and a time-varying β t .In parameter setting II, the time course of β t was chosen as an autoregressive process of order p = 1 (AR[1]) with an added exponential decay.The autoregressive process is given by x i = 0.0022+0.95x i−1 +w i with noise term w i ∼ N (0, 0.004).Furthermore, a step function with three constant levels was utilized in parameter setting III.Data was simulated (not shown) and the SIRD-AKS algorithm was applied analogously to the previous setting I with sinusoidal β t .For all parameter settings I, II and III, the SIRD-AKS results adequately capture the dynamics of the three time-dependent parameters, demonstrating a broad applicability of the method to different function classes of time-dependent parameter time courses.

Performance for high noise levels
To investigate how robust the SIRD-AKS method performs with respect to possibly high observation noise, we again considered parameter setting II with the stochastic process for R t .In parallel, four data sets were simulated with different noise levels as shown in Figures 2A-D.The parameter time courses and initial values for the states were chosen to be the same for all these scenarios.This is reflected by the true value of parameter R t that shows the same trajectory in all four scenarios, represented by the red line in Figure 2. In contrast, the simulated data for the observed fluxes expectedly shows higher variations for higher noise levels, shown in green in Figure 2.
The SIRD-AKS method was applied to all four scenarios and its estimates are shown as black a time-constant parameter γ and time-dependent parameters βt and θt.From these parameters, the effective reproduction number Rt was calculated via Equation (2).True parameter time courses and states are depicted as red solid lines and red dashed lines, respectively.Thereof, data was simulated at 375 equidistant time points for the observed fluxes vD, vI and vR as indicated by green dots.The SIRD-AKS is based on the reparameterized SIRD model and thus does not yield estimates for βt or S but instead directly for Rt.The estimates of the SIRD-AKS are shown as black solid lines with gray bands for their uncertainties.Results are shown on logarithmic scale.The results of the the SIRD-AKS algorithm are able to describe the data and are in good agreement with the true parameter time courses, both for the time-constant and time-dependent parameters.Note that the SIRD-AKS estimates for the states overlap with the true time courses of the model states.B-C: In different parameter settings, simulation was performed with time-constant parameters γ and θ, whereas βt was either chosen as a stochastic process (B), or as a step function (C), respectively.The corresponding values for Rt were computed using Equation (2).In both cases, the SIRD-AKS results retrieve the true parameter time courses adequately.lines in Figure 2. Firstly, it can be seen that the observations are adequately described by the SIRD-AKS which follows the data points also when they rapidly vary over time as in Figure 2D.Secondly, the correct trajectory of the time-dependent parameter R t as well as the correct levels of the time-constant parameters γ and θ were obtained in all four scenarios.However, the higher the noise level, the more it becomes evident that the SIRD-AKS estimates fluctuate around the true values of the parameters, while the average form of the parameter time course is still in accordance with the time course for the true value of R t .Taken together, these results show that the SIRD-AKS results for the estimates of the time-dependent parameters remain correct, even for relatively high noise levels in the data.

Influence of potential model misspecifications
Up to this point, we have only shown applications of the SIRD-AKS method to simulated data from the same SIRD model structure.In more realistic applications, however, the true model structure is not known, or as in all real-world scenarios, the chosen model structure is only an appropriate simplification of the real underlying biological process.Therefor, we next investigated the performance of the SIRD-AKS method in the light of model misspecifications.
To this aim, data was generated from non-SIRD model structures, but analyzed by the AKS method using the inappropriate SIRD model structure.We tried to reconstruct the time courses of the effective reproduction number in two examples of such intentionally misspecified model structures for the SIRD-AKS method.
First, the so-called SEIRD model was considered for data generation.It constitutes an extension of the SIRD model augmented by a model state of Exposed (E) corresponding to individuals which are infected but not yet infectious.This extension is commonly considered when the disease progression underlies an incubation period.The ODE system of the SEIRD model reads where the reciprocal of the additional parameter δ −1 is called latency period.Note that the limiting case of the SEIRD model for δ −1 → 0 is the SIRD model, while for higher values of δ −1 the difference between the two models increases and distinct dynamics are expected.To illustrate this transition, four different analyses with latency periods varying from 4 to 21 days were performed.
The SEIRD model Equation ( 24) was simulated for data generation with time-dependent parameters β t and θ t as well as time-constant parameter γ t , following the previous parameter setting I. Again, the true values for the reproduction number R t were calculated from the model parameters for the SEIRD model [ Van den Driessche, 2017].To mimic a realistic application of the presented method in an epidemic scenario, where the true underlying process is not fully known, data from the SEIRD model was analyzed by the SIRD-AKS and only the daily numbers of newly infected v I , newly recovered v R and newly dead v D were provided as observations to the algorithm.In consequence, the SIRD-AKS is able to estimate the SIRD model states and parameters, but cannot infer the transition rate δ or the exposed state E.
The resulting time courses of the SIRD-AKS estimates are shown in the upper part of Figure 3 with increasing values for the latency period δ −1 from panel A to D. Comparing the four simulation scenarios, the different latency periods are reflected both in different dynamics   24) was simulated with time-varying parameters βt and θt and for different latency periods increasing from δ −1 = 4d (A) to δ −1 = 21d (D).Calculated time courses for Rt depend on the latency period parameter and thus changes in the four scenarios.Simulated data was generated from the SEIRD model fluxes with 5 % relative error.The data was analysed by the SIRD-AKS algorithm, which assumes the reparameterized SIRD model as truth and, in particular, does not contain an Exposed state E. Despite the model misspecification between the data generating process and the analyzing model, the SIRD-AKS estimates for the time-dependent parameters are accurately met for shorter latency periods in Scenario A-B.For larger deviations from the assumed SIRD model structure, the AKS results are slightly biased towards values of one (C-D), while the overall dynamics is retrieved sufficiently.
of the observables and in the time courses of the reproduction number R t .As expected, the peak in the observed infections appears later for higher latency periods corresponding to longer times that individuals remain in the exposed state E.
For all scenarios, the SIRD-AKS was applied to the SEIRD data sets, as shown in black in Figures 3A-D.It can be seen that the time courses of the parameters γ t and θ t are throughout accurately obtained.Interestingly, the estimates for the effective reproduction number R t are also very close to the true values for low to medium latency periods.In contrast, for higher latency periods, the estimates for R t are slightly shifted to later time points and also biased towards the characteristic value of R t = 1.In total, this shows that the AKS method is able to cope with moderate deviations in the model structure, while more extreme model misspecifications, e.g.latency periods of three weeks, can noticeably influence the result of the estimated time-dependent parameters.Especially, for COVID-19 such a model misspecification does not negatively impact the SIRD-AKS estimation results, since typically a latency period of 5-6 days was reported [Xin et al., 2021].
In order to formulate a model-misspecification problem that is closer to a real-world scenario, we additionally utilized a larger and more complicated model structure that covers a multitude of additional features.For this, we employed the so-called SECIR model that consists of sixteen model states and has been developed for a detailed description of the SARS-CoV-2 epidemic in Germany in 2020, including for example several infected carrier states as well as diverse hospitalization stages [Khailaie et al., 2021].This model was used for data generation by using parameters values within the allowed parameter ranges as in the original publication.The only deviation is that we replaced the originally estimated time course of the transmission parameter R 1 by a simple five-step function.This parameter mainly influences the time course of the true value of the effective reproduction number R t within the SECIR model, which was calculated according to the respective equation in the original publication and is shown as a reference in red in Figure 4. Within the chosen parameter set for the simulation, the resulting true R t time course covers a broad range of values, in particular below and above the characteristic value of R t = 1.The generated data sets for the newly infected v I , newly recovered v R and newly dead v D were again provided as observations of the SIRD-AKS analysis.The estimated time course of the effective reproduction number R t from the SIRD-AKS with SECIR data is shown as black line in Figure 4. Except for the level of the first plateau until t = 80, the R t value from the SIRD-AKS method shows a similar shape as the true R t time course and yields almost correct levels on the other plateaus.However, the estimated R t time course is slightly biased to R t = 1 after t = 80 days.
For a comparison of the different methods, Figure 4 also shows the results for the reproduction number R s,τ t calculated by the incidence-based method from Equation ( 23) for different values of the fixed serial interval time s = 2, ..., 10 days in unit steps, and with averaging windows τ = 4 days.It can be clearly seen that the choice of the fixed serial interval time heavily influences the resulting levels of the reproduction number.The Robert Koch Institute (RKI) as German center for disease control provided officially-issued reproduction number values during the SARS-Cov-2 pandemic for Germany based on this method with a chosen fixed serial interval value of s = 4 days.The accordingly calculated R 4,4 t value for the SECIR data set is highlighted in blue in Figure 4. Comparing blue and red line, it can be concluded that at least for the simulations of the SECIR model and the chosen parameterization, a fixed serial interval of s = 4 days is not an adequate choice.While the RKI-issued value roughly covers the general form of the time course, it is noticeably biased towards the characteristic value of R t = 1.As a consequence, the rate of increase or decay in newly infected individuals within an ongoing disease outbreak would be definitely underestimated.
An appropriate choice of the serial-interval distribution or at least an optimal choice of the fixed serial interval time s would solve this issue, as can be seen from the grey lines in Figure 4.However, it should be noted that the selection of the optimal value for s is difficult in real- world applications and requires additional data or assumptions.That is, the simplicity of the calculation and interpretation using the incidence-based method comes at the price of required assumptions for the serial interval times.
Some deviations from the true R t value also appear in the results of the SIRD-AKS method.However, we surmise that these originate from the misspecified model structure.We argue that a clear advantage of the SIRD-AKS method is that it does not rely on additional hyperparameters or distributional assumptions, such as the incidence-based method does on the serial interval time.This renders the SIRD-AKS method to be a de facto non-parametric approach for the estimation of the reproduction number and model parameters in general.As shown in this example, the SIRD-AKS method also yields reasonable results, when the internal state space model does not meet the model structure of the data generating process.It therefore qualifies for the application to real-world data sets, where the true underlying process is masked and thus the optimal choice of an appropriate fixed serial interval time for the incidence-based method becomes even more difficult.

Application to SARS-CoV-2 data from Germany
After demonstrating the performance of the SIRD-AKS method for multiple parameter time courses and model misspecifications in simulated data sets, the method was applied to COVID-19 data from Germany for January 2020 until August 2021.The data was taken from the COVID-19 Data Repository maintained by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University [Dong et al., 2020], and is displayed in Figure 5A as green dots.Data pre-processing was performed with a moving average over seven days in order to reduce week-day effects and reporting delays in the raw data.
Figure 5A shows the analyzed data and estimated time courses of the observed fluxes for the newly infected v I , newly recovered v R and newly dead v D .The estimated time courses for the recovery rate γ t and death rate θ t are shown in Figure 5B.The fluctuations in the death rate are in line with the time periods when more old people had been reported to get infected, i.e. from April to June and from November to February 2020 [Staerk et al., 2021].This also agrees with previous studies showing that the mortality of COVID-19 is heavily linked to the patients age [Yanez et al., 2020, Ho et al., 2020].
The estimated time-dependent reproduction rate R t from the SIRD-AKS method is shown as black line in Figure 5C.For comparison, two additional R t value are provided, which are both based on the previously discussed incidence-based method and assume a fixed serial interval of s = 4 days and a averaging window of τ = 7 days.However, they use different data sources.The R t value shown as blue line is calculated from the same data from [Dong et al., 2020] that is also analyzed by the SIRD-AKS method.The other shown value in mangenta is the officially-issued and so-called seven-day R 4,7 t value from the RKI [Hamouda et al., 2020].The difference is that it does not use the raw data from the CSSE repository, but instead applies an additional socalled Nowcasting pre-processing method that copes with the redistribution of the case numbers for the elimination of reporting delays [Höhle and an der Heiden, 2014].Both incidence-based methods show a similar time course for R t but with peaks that seem to be shifted by a couple of days, presumably because of the case number redistribution.
When compared to the estimate from the SIRD-AKS method, both incidence-based values again lie closer to R t = 1.The time course from the SIRD-AKS shows a larger variety of peak heights in both directions and more deviations from R t = 1 on longer time-scales.This becomes even more prominent when the changes in the reproduction numbers are compared to the time points of new non-pharmaceutical interventions and control measures as indicated by vertical lines.While the incidence-based R t values likewise decrease drastically at the start of the first nation-wide lockdown in March 2020, the end of the first lockdown is only visible as a trend change of the SIRD-AKS reproduction number, but not in the incidence-based approaches.A similar case occurs for the so-called lockdown light in November 2020 and second nationwide lockdown in December 2020 [Fazit Communication GmbH, 2021].Both incidence-based measures barely show any effect on the reproduction number, while the SIRD-AKS estimate shows a fluctuating, yet undoubtedly decreasing trend until the approximate time point where the SARS-CoV-2 Alpha variant spread in Germany.
Figure 5: Application of the SIRD-AKS to SARS-CoV-2 data of Germany from the CSSE repository.A: Incidence data taken from [Dong et al., 2020] for March 2020 until August 2021 is shown in green.SIRD-AKS results for the fluxes of newly infected, newly recovered and newly deceased is shown as black line.B: After an initial rise, the SIRD-AKS estimated for the recovery rate γt stays nearly constant, while the estimated death rate θt shows a dynamic with two peaks at around May-June 2020 and February-March 2021.These time points coincide with periods of high hospitalization states and where mostly older people were infected [Staerk et al., 2021].C: The SIRD-AKS estimates (black) for the reproduction number are displayed in comparison to the seven-day reproduction number based on Nowcast data and published by the RKI in magenta [Hamouda et al., 2020] as well as the calculated R s,τ t incidence-based method in blue that is based on CSSE data.Both incidence-based methods assume a fixed serial time of s = 4 days and averaging window of τ = 7 days.Time points of essential control measures are indicated by colored vertical lines [Fazit Communication GmbH, 2021].

Validation of parameter time course estimates in an ODE model
The presented SIRD-AKS method is based on the transition of the model structure from the time-continuous ODE system to the time-discrete and recursive formulation in the state space.Further, estimates for the time-dependent parameters are in fact augmented states in the state space, interpreted as parameter time courses.We asked whether these parameter time courses produce coherent dynamics when being incorporated into the corresponding ODE system as input functions.
To perform this consistency check, we plugged in the estimated time courses for the parameters R t , γ t and θ t from Figure 5B and C into the ODEs of the reparameterized SIRD model in Equation (20).To choose appropriate initial values for the numerical solution of this ODE system, the values of the SIRD-AKS estimates at time point k = 62, i.e.March 24, 2020, were chosen.This time point corresponds to the first time point where the estimated uncertainties of the time-dependent parameters become adequately small.It also coincides with the starting point of the large initial decrease in the time course of R t , c.f. Figure 5C.Results of the numerical simulation of the time-continuous ODE system with time-dependent parameters are shown in Figure 6 as black line.It is remarkable that all three trajectories are very well in agreement with the flux data for v I , v R and v D that went into the previous AKS analysis, shown as green dots in Figure 6.This confirms that parameter time courses estimated from incidence data by the AKS method, are indeed able to reassemble the correct dynamics in the ODE solutions.The presented AKS method thus qualifies as potential input estimation approach for ODE models in general.
For comparison, the same ODE simulation was performed by utilizing the alternative R t time course calculated from the incidence-based method with fixed serial interval time s = 4 days and averaging interval τ = 7 days, as presented above in Figure 5C as blue line.Since this methods lacks estimates for parameters γ t and θ t as required to solve the ODE system, time course estimates for γ t and θ t from the SIRD-AKS method were chosen.The resulting incidence-based ODE trajectories are displayed as blue line in Figure 6.
In contrast to the ODE trajectories with SIRD-AKS estimates, the ODE solutions with incidence-based R t do not agree well with the data.Although, the overall dynamic activity of the data is approximately covered, it does not well capture certain details of the data series, e.g. the partial drop in the infected flux during the second wave in February 2021.Furthermore, the predicted amplitudes of newly infected individuals are not adequately met, and thus the overall pandemic situation would be underestimated.
It should be noted that the lack of estimates for γ t and θ t from the incidence-based method renders some difficulty for a fair method comparison within this ODE-validation approach.
In conclusion, the incidence-based method does not provide an estimate for the reproduction number R t that can be used as an input function in the addressed ODE model.Instead for the shown real-word application, the incidence-based method R t generates an ODE solution which differs from the observations in orders of magnitude.In contrast, the SIRD-AKS method reassembles an adequate ODE solution, even in the potentially over-simplified SIRD model.

Conclusion
The general task of estimating time-dependent parameters in ODE models, also referred to as input function estimation, is known to be conceptually difficult.Either a particular function class, i.e. a parameterization of a general input function, needs to be provided, or the initial value problem of solving the ODE needs to be transferred into a boundary value problem for which no good optimizers are available [Kaschek and Timmer, 2012].
To remedy this, we present in this work a non-parametric model-based estimation for all timedependent parameters in epidemiological ODE models by combining an Expectation-Maximization algorithm with the Augmented Kalman Smoother (AKS) algorithm.
Our approach is able to estimate the time-dependencies in multiple model parameters and independently from the particular input function class.While other approaches exist that also take advantage of Kalman filter methods for the estimation of a single time-dependency in compartmental models [Arroyo-Marioli et al., 2021], our approach is more general and enables estimation of all time-dependent model parameters simultaneously.
We showed the performance of our AKS method using an SIRD core model structure (SIRD-AKS) in diverse simulation settings.The SIRD-AKS performs well even for high noise levels and under different degrees of model misspecification.Thus, the method also yields meaningful estimates in situations where the details of the data generating process remain unknown, meaning that the assumed AKS model structure does not perfectly cover the true process structure.In particular, the presented approach only utilizes incidence data of infected, recovered and deceased individuals and, as a major advantage compared to existing methods [Khailaie et al., 2021, Hamouda et al., 2020], does not require any further prior knowledge on the analyzed disease.
When applying the SIRD-AKS method to COVID-19 data from Germany, it can be observed that all parameter time course estimates are plausible and in particular the effective reproduction number R t estimate displays the impact of all relevant non-pharmaceutical interventions and applied control measures during the SARS-CoV-2 pandemic in Germany.Further, the simultaneously estimated death rate θ t matches well to the time pattern of how many elderly people were infected by the disease.
We further compared our SIRD-AKS method to an alternative approach for the estimation of R t which considerably depends on the choice for the value of the assumed Dirac δ-distributed serial interval.
In contrast, the SIRD-AKS method is independent of the distribution of the serial interval and therefore, it does not require any information on the assumed serial interval time as a hyperparameter.Instead, our approach internally copes with the serial interval through the ODE model and is not affected by any inadequate assumptions on the serial interval distribution.Even if the serial interval distribution is also subject to change over time, as for example reported in [Ali et al., 2020], this is compensated by the general time-dependency e.g. of the transmission parameter β t in the SIRD model.
Compared to the incidence-based method, the SIRD-AKS estimate for the effective reproduction number R t shows the pandemic situation more clearly as it is less biased towards R t = 1.Furthermore, we checked the appropriateness of the estimated parameter time courses by the SIRD-AKS method by plugging them back into the original ODE model.While the incidencebased estimates yielded case numbers that were orders of magnitude below the original data, we were able to correctly reassemble the case numbers by the ODE model from the SIRD-AKS estimates.
In summary, the SIRD-AKS method is a non-parametric model-based estimation approach for the reproduction number and other time-dependent model parameters that only requires incidence data of newly infected, recovered and deceased individuals.Since by construction the presented approach is not restricted to epidemiological ODE models, it might be used also for input function estimation in non-linear ODE models in general.

Figure 1 :
Figure 1: Estimation of multiple time-dependent parameters.A: The classical SIRD model was simulated with

Figure 2 :
Figure 2: Performance of SIRD-AKS is robust towards high noise levels in the data.A-D: The SIRD modelwas simulated with a time-varying βt and time-constant γt and θt as shown in red.The parameters were chosen as in parameter setting II in Figure1.The data was simulated using increasing noise levels from scenario A to D as shown as green dots for the observed fluxes vD, vI and vR.The SIRD-AKS estimates are shown as black lines with grey error bands for their uncertainties and are compared to the corresponding true time courses.Again, instead of parameter βt from the data generating SIRD model, Rt was calculated from the true parameter time courses and compared to the SIRD-AKS result.While the precision of the SIRD-AKS estimates decreases with higher noise levels, the average dynamics of the parameters time courses are satisfactorily met.

Figure 4 :
Figure4: Comparison of SIRD-AKS estimation and incidence-method with fixed serial interval time.Simulated data was generated from the SECIR model[Khailaie et al., 2021] with five-step function as time-dependent input for parameter R1.The corresponding true time course for Rt is depicted as red line.The SIRD-AKS estimate as shown in black roughly reassembles the true SECIR reproduction number.The R s,τ =4 t value calculated by the incidence-based method from Equation (23) with fixed s = 4 is shown in blue.It is biased towards Rt = 1 (dashed line) and is not able to correctly estimate the true time course.For different choices of the fixed serial interval time hyperparameter s, the incidence-based method may yield better results for R s,4 t , as shown by the grey lines for different values of s = 2, ..., 10.

Figure 6 :
Figure 6: ODE solutions for the SIRD model with estimated time-dependent parameters.The fluxes for vI , vR and vD are calculated from the numerically solved ODEs of the reparameterized SIRD model from Equation (3) and are compared to Germany data from Figure 5A as depicted by green dots.The black lines indicate the ODE solutions when the three estimated time courses for γt, θt and Rr from Figure 5B and C are utilized as time-dependent parameters in the SIRD ODE system.Identical ODE simulations were performed and are shown as blue line, where the Rt time course is taken from the incidence-based method with fixed s = 4 and τ = 7 (c.f. Figure 5C blue line).While the SIRS-AKS estimates yield almost perfect ODE trajectories, the incidence-based method underestimates the dynamics of the pandemic in an SIRD model.