# Epidemiological Forecasting with Model Reduction of Compartmental Models. Application to the COVID-19 Pandemic

## Abstract

## Simple Summary

## 1. Introduction

- Uninfected people, called susceptible (S);
- Infected and contagious people (I), with more or less marked symptoms;
- People removed (R) from the infectious process, either because they were cured or unfortunately died after being infected.

- $\gamma >0$ represents the recovery rate. In other words, its inverse ${\gamma}^{-1}$ can be interpreted as the length (in days) of the contagious period;
- $\beta >0$ is the transmission rate of the disease. It essentially depends on two factors: the contagiousness of the disease and the contact rate within the population. The larger this second parameter is, the faster the transition from susceptible to infectious will be. As a consequence, the number of hospitalized patients may increase very quickly and may lead to a collapse of the health system [13]. Strong distancing measures such as confinement can effectively act on this parameter [14,15], helping to keep it low.
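
As a minimal sketch of how these two parameters drive the dynamics, the snippet below integrates the classical SIR system $S'=-\beta SI/N$, $I'=\beta SI/N-\gamma I$, $R'=\gamma I$ with constant rates, using an explicit Euler scheme. The function name `simulate_sir`, the step size, and all numerical values are illustrative assumptions and are not taken from the study.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Integrate the SIR model with constant beta and gamma (explicit Euler)."""
    n = s0 + i0 + r0                      # total population, conserved by the scheme
    s, i, r = float(s0), float(i0), float(r0)
    traj = [(s, i, r)]
    for _ in range(int(round(days / dt))):
        new_inf = beta * s * i / n * dt   # S -> I transitions during dt
        new_rem = gamma * i * dt          # I -> R transitions during dt
        s, i, r = s - new_inf, i + new_inf - new_rem, r + new_rem
        traj.append((s, i, r))
    return traj


# Hypothetical values: same recovery rate, low versus high transmission rate.
low = simulate_sir(beta=0.15, gamma=0.1, s0=999_990, i0=10, r0=0, days=200)
high = simulate_sir(beta=0.40, gamma=0.1, s0=999_990, i0=10, r0=0, days=200)

peak_low = max(i for _, i, _ in low)
peak_high = max(i for _, i, _ in high)
print(f"peak infected, low beta:  {peak_low:.0f}")
print(f"peak infected, high beta: {peak_high:.0f}")
```

In this toy setting, the larger transmission rate produces a markedly higher peak of simultaneously infected people, which is the mechanism behind the pressure on the health system mentioned above.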

## 2. Methodology for a Single Region

#### 2.1. Compartmental Models

#### 2.1.1. SIR Models with Time-Dependent Parameters

#### 2.1.2. Detailed Compartmental Models

- Second model, SE2IUR: This model is a variant of the one proposed in [9]. It involves six different compartments (see Table 3) and a set of six parameters. The dynamics of the model are illustrated in Figure 2 and the set of equations is as follows: $$\begin{aligned}\frac{dS}{dt}(t)&=-\frac{1}{N}\beta S(t)\left(E_{2}(t)+U(t)+I(t)\right)\\ \frac{dE_{1}}{dt}(t)&=\frac{1}{N}\beta S(t)\left(E_{2}(t)+U(t)+I(t)\right)-\delta E_{1}(t)\\ \frac{dE_{2}}{dt}(t)&=\delta E_{1}(t)-\sigma E_{2}(t)\\ \frac{dI}{dt}(t)&=\nu \sigma E_{2}(t)-\gamma_{1}I(t)\\ \frac{dU}{dt}(t)&=(1-\nu )\sigma E_{2}(t)-\gamma_{2}U(t)\\ \frac{dR}{dt}(t)&=\gamma_{1}I(t)+\gamma_{2}U(t)\end{aligned}$$ We denote by $$\mathbf{u}=\mathrm{SE2IUR}(\mathbf{u}_{0},\beta ,\delta ,\sigma ,\nu ,\gamma_{1},\gamma_{2},[0,T])$$ the solution of this system in $[0,T]$ with initial condition $\mathbf{u}_{0}$.
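
The SE2IUR system above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `se2iur_rhs` and `se2iur`, the explicit Euler scheme, the step size, and the parameter values are hypothetical choices and not those used by the authors.

```python
def se2iur_rhs(u, beta, delta, sigma, nu, gamma1, gamma2):
    """Right-hand side of the SE2IUR equations for u = (S, E1, E2, I, U, R)."""
    s, e1, e2, i, un, r = u
    n = sum(u)                                   # total population N
    force = beta * s * (e2 + un + i) / n         # new exposures per unit time
    return (
        -force,
        force - delta * e1,
        delta * e1 - sigma * e2,
        nu * sigma * e2 - gamma1 * i,
        (1.0 - nu) * sigma * e2 - gamma2 * un,
        gamma1 * i + gamma2 * un,
    )


def se2iur(u0, beta, delta, sigma, nu, gamma1, gamma2, t_final, dt=0.05):
    """Explicit Euler integration on [0, t_final]; returns the list of states."""
    u = tuple(float(x) for x in u0)
    out = [u]
    for _ in range(int(round(t_final / dt))):
        du = se2iur_rhs(u, beta, delta, sigma, nu, gamma1, gamma2)
        u = tuple(x + dt * dx for x, dx in zip(u, du))
        out.append(u)
    return out


# Hypothetical initial condition and parameters (rates in 1/day).
u0 = (999_990, 10, 0, 0, 0, 0)
sol = se2iur(u0, beta=0.5, delta=1 / 3, sigma=1 / 2, nu=0.3,
             gamma1=1 / 7, gamma2=1 / 7, t_final=120)
print("final state (S, E1, E2, I, U, R):", [round(x) for x in sol[-1]])
```

Since the six right-hand sides sum to zero, the total population is conserved, which gives a quick sanity check on any implementation.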

- Generalization: In the following, we abstract the procedure as follows. For any Detailed_Model with d compartments involving a vector $\mu \in {\mathbb{R}}^{p}$ of p parameters, we denote by$$\mathbf{u}\left(\mu \right)=\mathrm{Detailed}\_\mathrm{Model}(\mu ,[0,\tilde{T}]),\phantom{\rule{1.em}{0ex}}\mathbf{u}\left(\mu \right)\in {L}^{\infty}([0,\tilde{T}],{\mathbb{R}}^{d})$$

#### 2.2. Forecasting Based on Model Reduction of Detailed Models

**Proposition 1.**

- (i) ${S}_{obs}\left(t\right)+{I}_{obs}\left(t\right)+{R}_{obs}\left(t\right)=N$ for every $t\in [0,T]$,
- (ii) ${S}_{obs}$ is nonincreasing on $[0,T]$,
- (iii) ${R}_{obs}$ is nondecreasing on $[0,T]$,
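
Conditions (i)–(iii) are easy to verify on a discretely sampled series before attempting any fit. The helper below is a small sketch; the function name, the relative tolerance, and the toy series are assumptions made for illustration only.

```python
def satisfies_sir_data_conditions(s_obs, i_obs, r_obs, n, tol=1e-9):
    """Check conditions (i)-(iii) on sampled series, up to a relative tolerance."""
    # (i) the three compartments always sum to the population size N
    mass_ok = all(abs(s + i + r - n) <= tol * n
                  for s, i, r in zip(s_obs, i_obs, r_obs))
    # (ii) S_obs is nonincreasing
    s_ok = all(b <= a + tol * n for a, b in zip(s_obs, s_obs[1:]))
    # (iii) R_obs is nondecreasing
    r_ok = all(b >= a - tol * n for a, b in zip(r_obs, r_obs[1:]))
    return mass_ok and s_ok and r_ok


# Toy series with N = 100: the first satisfies all conditions,
# the second violates (iii) because R decreases at the last sample.
ok = satisfies_sir_data_conditions([99, 97, 94], [1, 2, 4], [0, 1, 2], 100)
bad = satisfies_sir_data_conditions([99, 97, 94.5], [1, 2, 5], [0, 1, 0.5], 100)
print(ok, bad)
```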

**Proof.**

- Learning phase: The fundamental hypothesis of our approach is that specialists in epidemiology have understood the mechanisms of infection transmission sufficiently well. The second hypothesis is that the difficulty with these accurate models lies in properly determining the parameters they contain. We thus propose to generate a large number of virtual epidemics, following these mechanistic models, with parameters chosen in the vicinity of the suggested parameter values, including different initial conditions.
  - (a) Generate virtual scenarios using detailed models with constant coefficients:
    - Define the notion of a Detailed_Model which is most appropriate for the epidemiological study. Several models could be considered. For our numerical application, the detailed models are defined in Section 2.1.
    - Define an interval range $\mathcal{P}\subset {\mathbb{R}}^{p}$ where the parameters $\mu $ of Detailed_Model vary. We call the solution manifold $\mathcal{U}$ the set of virtual dynamics over $[0,T+\tau ]$, namely $$\mathcal{U}:=\{\mathbf{u}(\mu )=\mathrm{Detailed}\_\mathrm{Model}(\mu ,[0,T+\tau ])\;:\;\mu \in \mathcal{P}\}.$$
    - Draw a finite training set $${\mathcal{P}}_{\mathrm{tr}}=\{{\mu}_{1},\dots ,{\mu}_{K}\}\subseteq \mathcal{P}$$ and compute the associated set of virtual scenarios $${\mathcal{U}}_{\mathrm{tr}}=\{\mathbf{u}(\mu )\;:\;\mu \in {\mathcal{P}}_{\mathrm{tr}}\}.$$
  - (b) Collapse every detailed model $\mathbf{u}(\mu )\in {\mathcal{U}}_{\mathrm{tr}}$ into an SIR model following the ideas explained in Section 2.3. For every $\mathbf{u}(\mu )$, the procedure gives time-dependent parameters $\mathbf{\beta}(\mu )$ and $\mathbf{\gamma}(\mu )$ and associated SIR solutions $(\mathbf{S},\mathbf{I},\mathbf{R})(\mu )$ in $[0,T+\tau ]$. This yields the sets $${\mathcal{B}}_{\mathrm{tr}}:=\{\mathbf{\beta}(\mu )\;:\;\mu \in {\mathcal{P}}_{\mathrm{tr}}\}\quad \mathrm{and}\quad {\mathcal{G}}_{\mathrm{tr}}:=\{\mathbf{\gamma}(\mu )\;:\;\mu \in {\mathcal{P}}_{\mathrm{tr}}\}.$$
  - (c) **Compute reduced models:** We apply model reduction techniques using ${\mathcal{B}}_{\mathrm{tr}}$ and ${\mathcal{G}}_{\mathrm{tr}}$ as training sets in order to build two bases $${\mathrm{B}}_{n}=\mathrm{span}\{{b}_{1},\dots ,{b}_{n}\},\quad {\mathrm{G}}_{n}=\mathrm{span}\{{g}_{1},\dots ,{g}_{n}\}\subset {L}^{\infty}([0,T+\tau ],\mathbb{R}).$$

- Fitting on the reduced spaces: We next solve the fitting problem (2) in the interval $[0,T]$ by searching $\mathbf{\beta}$ (or $\mathbf{\gamma}$) in ${\mathrm{B}}_{n}$ (or in ${\mathrm{G}}_{n}$) instead of in ${L}^{\infty}\left([0,T]\right)$; that is, $${J}_{({\mathrm{B}}_{n},{\mathrm{G}}_{n})}^{*}=\min_{(\mathbf{\beta},\mathbf{\gamma})\in {\mathrm{B}}_{n}\times {\mathrm{G}}_{n}}\mathcal{J}(\mathbf{\beta},\mathbf{\gamma}\;|\;{\mathbf{I}}_{\mathrm{obs}},{\mathbf{R}}_{\mathrm{obs}},[0,T]).$$ Note that the respective dimensions of ${\mathrm{B}}_{n}$ and ${\mathrm{G}}_{n}$ can be different; for simplicity, we consider them to be equal in the following. Since ${\mathrm{B}}_{n}$ and ${\mathrm{G}}_{n}$ are subsets of ${L}^{\infty}\left([0,T]\right)$, we obtain $${J}^{*}\le {J}_{({\mathrm{B}}_{n},{\mathrm{G}}_{n})}^{*}.$$ The solution of problem (5) gives us the coefficients ${\left({c}_{i}^{*}\right)}_{i=1}^{n}$ and ${\left({\tilde{c}}_{i}^{*}\right)}_{i=1}^{n}\in {\mathbb{R}}^{n}$, which define the time-dependent parameters $$\begin{aligned}{\beta}_{n}^{*}(t)&=\sum _{i=1}^{n}{c}_{i}^{*}{b}_{i}(t),\quad \forall t\in [0,T+\tau ],\\ {\gamma}_{n}^{*}(t)&=\sum _{i=1}^{n}{\tilde{c}}_{i}^{*}{g}_{i}(t).\end{aligned}$$
- Forecast: For a given dimension n of the reduced spaces, we propagate in $[0,T+\tau ]$ the associated SIR model, as follows:$$({\mathbf{S}}_{n}^{*},{\mathbf{I}}_{n}^{*},{\mathbf{R}}_{n}^{*})=\mathrm{SIR}({\mathbf{\beta}}_{n}^{*},{\mathbf{\gamma}}_{n}^{*},[0,T+\tau ])$$The values ${I}_{n}^{*}\left(t\right)$ and ${R}_{n}^{*}\left(t\right)$ for $t\in [0,T[$ are by construction close to the observed data ${\mathbf{I}}_{\mathrm{obs}},{\mathbf{R}}_{\mathrm{obs}}$ (up to some numerical optimization error). The values ${I}_{n}^{*}\left(t\right)$ and ${R}_{n}^{*}\left(t\right)$ for $t\in [T,T+\tau ]$ are then used for prediction.
- Forecast combination/aggregation of experts (optional step): By varying the dimension n and using different model reduction approaches, we can easily produce a collection of different forecasts, and thus the question of how to select the best predictive model arises. Alternatively, we can also resort to forecast combination techniques [17]: denoting $({I}_{1},{R}_{1}),\dots ,({I}_{P},{R}_{P})$ as the different forecasts, the idea is to search for an appropriate linear combination$${I}^{\mathrm{FC}}\left(t\right)=\sum _{p=1}^{P}{w}_{p}{I}_{p}\left(t\right)$$
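
The forecast step above amounts to evaluating the expansions $\beta_n^*(t)=\sum_i c_i^* b_i(t)$ and $\gamma_n^*(t)=\sum_i \tilde c_i^* g_i(t)$ on $[0,T+\tau]$ and propagating the SIR model with these time-dependent rates. The sketch below assumes daily sampling and an explicit Euler scheme with a one-day step. The names `expand` and `propagate_sir`, the toy basis functions, and the coefficient values are hypothetical stand-ins for the quantities produced by the learning and fitting phases.

```python
import math


def expand(coeffs, basis):
    """Evaluate sum_i coeffs[i] * basis[i][k] for every time index k."""
    return [sum(c * b[k] for c, b in zip(coeffs, basis))
            for k in range(len(basis[0]))]


def propagate_sir(beta_t, gamma_t, s0, i0, r0, dt=1.0):
    """Explicit Euler propagation of the SIR model with time-dependent rates."""
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    traj = [(s, i, r)]
    for beta, gamma in zip(beta_t, gamma_t):
        new_inf = beta * s * i / n * dt
        new_rem = gamma * i * dt
        s, i, r = s - new_inf, i + new_inf - new_rem, r + new_rem
        traj.append((s, i, r))
    return traj


T, TAU = 30, 14                      # fitting window and forecast horizon (days)
days = range(T + TAU)

# Toy reduced bases sampled on [0, T + tau] (assumed, not from the paper).
b_basis = [[1.0 for _ in days], [math.exp(-k / 10.0) for k in days]]
g_basis = [[1.0 for _ in days]]
c_star, c_tilde_star = [0.05, 0.30], [0.10]   # assumed fitted coefficients

beta_n = expand(c_star, b_basis)
gamma_n = expand(c_tilde_star, g_basis)
traj = propagate_sir(beta_n, gamma_n, s0=999_900, i0=100, r0=0)

forecast = traj[T:]                  # states for t in [T, T + tau]
print("I at T:", round(forecast[0][1]), " I at T + tau:", round(forecast[-1][1]))
```

Because the basis functions are defined on the whole interval $[0,T+\tau]$, coefficients fitted on $[0,T]$ extend the rates to the forecast window without any further modelling choice.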

- To bring out the essential mechanisms, we have idealized some elements in the above discussion by omitting certain unavoidable discretization aspects. To start with, the ODE solutions cannot be computed exactly but only up to a certain level of accuracy given by a numerical integration scheme. In addition, the optimal control problems (2) and (5) are non-convex. As a result, in practice, we can only find a local minimum. Note, however, that modern solvers find solutions which are very satisfactory for all practical purposes. In addition, note that solving the control problem in a reduced space as in (5) could be interpreted as introducing a regularizing effect with respect to the control problem (2) in the full ${L}^{\infty}\left([0,T]\right)$ space. It is to be expected that the search for global minimizers is facilitated in the reduced landscape.
- routine-IR and routine-$\mathbf{\beta}\mathbf{\gamma}$: A variant for the fitting problem (5) as studied in our numerical experiments is to replace the cost function $\mathcal{J}(\mathbf{\beta},\mathbf{\gamma}\;|\;{\mathbf{I}}_{\mathrm{obs}},{\mathbf{R}}_{\mathrm{obs}},[0,T])$ by the cost function $$\tilde{\mathcal{J}}(\mathbf{\beta},\mathbf{\gamma}\,|\,{\mathbf{\beta}}_{\mathrm{obs}}^{*},{\mathbf{\gamma}}_{\mathrm{obs}}^{*},[0,T]):={\int}_{0}^{T}\left(|\mathbf{\beta}-{\mathbf{\beta}}_{\mathrm{obs}}^{*}{|}^{2}+|\mathbf{\gamma}-{\mathbf{\gamma}}_{\mathrm{obs}}^{*}{|}^{2}\right)\,\mathrm{d}t.$$ In other words, we use the variables ${\mathbf{\beta}}_{\mathrm{obs}}^{*}$ and ${\mathbf{\gamma}}_{\mathrm{obs}}^{*}$ from (3) as observed data instead of working with the observed health series ${\mathbf{I}}_{\mathrm{obs}}$, ${\mathbf{R}}_{\mathrm{obs}}$. In Section 4, we refer to the standard fitting method as routine-IR and to this variant as routine-$\mathbf{\beta}\mathbf{\gamma}$.
- The fitting procedure works both on the components of the reduced basis and the initial time of the epidemics to minimize the loss function; however, for simplicity, this last optimization is not reported here.
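
For the routine-$\mathbf{\beta}\mathbf{\gamma}$ variant, the cost is a sum of two independent quadratic terms, so, if no sign constraints are imposed on the coefficients, each fit reduces to a linear least-squares problem on the expansion coefficients. The following sketch solves the corresponding normal equations for $\mathbf{\beta}$ with a discrete (daily) approximation of the integral; the fit for $\mathbf{\gamma}$ is identical. The function names, the toy basis, and the synthetic "observed" series are assumptions for illustration and do not reproduce the authors' implementation.

```python
import math


def inner(f, g):
    """Discrete inner product approximating the L2 product on the fitting window."""
    return sum(x * y for x, y in zip(f, g))


def solve_linear(a, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    m = [row[:] + [rhs[k]] for k, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda row: abs(m[row][col]))
        m[col], m[piv] = m[piv], m[col]
        for row in range(col + 1, n):
            f = m[row][col] / m[col][col]
            for c in range(col, n + 1):
                m[row][c] -= f * m[col][c]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][c] * x[c] for c in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_in_reduced_space(target, basis, n_fit):
    """Least-squares coefficients of `target` on `basis` over the first n_fit samples."""
    cut = [b[:n_fit] for b in basis]
    gram = [[inner(bi, bj) for bj in cut] for bi in cut]
    rhs = [inner(bi, target[:n_fit]) for bi in cut]
    return solve_linear(gram, rhs)


T, TAU = 30, 14
days = range(T + TAU)
basis = [[1.0 for _ in days], [math.exp(-k / 10.0) for k in days]]   # toy B_n
beta_obs = [0.05 + 0.30 * math.exp(-k / 10.0) for k in range(T)]     # synthetic data on [0, T]

coeffs = fit_in_reduced_space(beta_obs, basis, n_fit=T)
beta_fit = [sum(c * b[k] for c, b in zip(coeffs, basis)) for k in days]
print("fitted coefficients:", [round(c, 6) for c in coeffs])
```

The standard routine-IR fit does not share this structure: there the cost depends on the coefficients through the SIR solution, so a nonlinear optimizer is needed.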

#### 2.3. Details on Step 1-(b): Collapsing the Detailed Models into SIR Dynamics

#### 2.4. Details of Model Order Reduction

- SVD: The eigenvalue decomposition of the correlation matrix ${M}_{\mathcal{B}}^{T}{M}_{\mathcal{B}}\in {\mathbb{R}}^{K\times K}$ gives $${M}_{\mathcal{B}}^{T}{M}_{\mathcal{B}}=V\mathsf{\Lambda}{V}^{T},$$ where $V=({v}_{j,i})\in {\mathbb{R}}^{K\times K}$ is orthogonal and $\mathsf{\Lambda}=\mathrm{diag}({\lambda}_{1},\dots ,{\lambda}_{K})$ collects the eigenvalues ${\lambda}_{1}\ge \dots \ge {\lambda}_{K}\ge 0$. The modes are then given by $${\mathbf{b}}_{i}=\sum _{j=1}^{K}{v}_{j,i}{\mathbf{\beta}}_{j},\quad 1\le i\le K.$$ For $n\le K$, the space $${\mathrm{B}}_{n}=\mathrm{span}\{{\mathbf{b}}_{1},\dots ,{\mathbf{b}}_{n}\}$$ minimizes the average projection error over the training set. Denoting by ${P}_{{\mathrm{B}}_{n}}$ the orthogonal projection onto ${\mathrm{B}}_{n}$, this error is $${\delta}_{n}{\left({\left\{{\mathbf{\beta}}_{i}\right\}}_{i=1}^{K}\right)}_{{\ell}^{2}\left({\mathbb{R}}^{Q+1}\right)}={\left(\frac{1}{K}\sum _{i=1}^{K}{\parallel {\mathbf{\beta}}_{i}-{P}_{{\mathrm{B}}_{n}}{\mathbf{\beta}}_{i}\parallel}_{{\ell}^{2}\left({\mathbb{R}}^{Q+1}\right)}^{2}\right)}^{1/2}={\left(\frac{1}{K}\sum _{i>n}{\lambda}_{i}\right)}^{1/2}.$$ Therefore, the SVD method is particularly efficient if there is a fast decay of the eigenvalues, meaning that the set ${\mathcal{B}}_{\mathrm{tr}}={\left\{{\mathbf{\beta}}_{i}\right\}}_{i=1}^{K}$ can be approximated by only a few modes. However, note that, by construction, this method does not ensure positivity in the sense that ${P}_{{\mathrm{B}}_{n}}{\beta}_{i}\left(t\right)$ may become negative for some $t\in [0,T]$ although the original function ${\mathbf{\beta}}_{i}\left(t\right)\ge 0$ for all $t\in [0,T]$. This is due to the fact that the vectors ${\mathbf{b}}_{i}$ are not necessarily nonnegative. As we will see later, in our study, ensuring positivity, especially for extrapolation (i.e., forecasting), is particularly important and motivates the next methods.
- Nonnegative Matrix Factorization (NMF, see [26,27]): NMF is a variant of SVD involving nonnegative modes and expansion coefficients. In this approach, we build a family of nonnegative functions ${\left\{{\mathbf{b}}_{i}\right\}}_{i=1}^{n}$ and we approximate each ${\mathbf{\beta}}_{i}$ with a linear combination $${\mathbf{\beta}}_{i}^{\mathrm{NMF}}=\sum _{j=1}^{n}{a}_{i,j}{\mathbf{b}}_{j},\quad 1\le i\le K,$$ with nonnegative coefficients ${a}_{i,j}\ge 0$. The factors are obtained by solving the minimization problem $$({W}^{*},{B}^{*})\in \underset{(W,B)\in {\mathbb{R}}_{+}^{K\times n}\times {\mathbb{R}}_{+}^{n\times Q}}{\arg \min}{\parallel {M}_{\mathcal{B}}-WB\parallel}_{F}^{2}.$$ We refer readers to [27] for further details on the NMF and its numerical aspects.
- The Enlarged Nonnegative Greedy (ENG) algorithm with projection on an extended cone of positive functions: We now present our novel model reduction method, which is of interest in itself as it allows us to build reduced models that preserve positivity and even other types of bounds. The method stems from the observation that NMF approximates functions in the cone of positive functions of $\mathrm{span}{\{{\mathbf{b}}_{i}\ge 0\}}_{i=1}^{n}$ since it imposes ${a}_{i,j}\ge 0$ in the linear combination (8). However, note that the positivity of the linear combination is not equivalent to the positivity of the coefficients ${a}_{i,j}$: there are linear combinations involving some coefficients ${a}_{i,j}<0$ of small magnitude which still deliver a nonnegative linear combination ${\sum}_{j=1}^{n}{a}_{i,j}{\mathbf{b}}_{j}$. We can thus widen the cone of linear combinations yielding positive values by carefully including these negative coefficients ${a}_{i,j}$. One possible strategy for this is proposed in Algorithm 1, which describes a routine that we call Enlarge_Cone. The routine $$\{{\mathbf{\psi}}_{1},\dots ,{\mathbf{\psi}}_{n}\}=\mathrm{Enlarge}\_\mathrm{Cone}[\{{\mathbf{b}}_{1},\dots ,{\mathbf{b}}_{n}\},\epsilon ]$$ takes a set of nonnegative functions and a tolerance $\epsilon >0$ and returns functions of the form $${\mathbf{\psi}}_{i}={\mathbf{b}}_{i}-\sum _{j\ne i}{\sigma}_{j}^{i}{\mathbf{b}}_{j},\quad \forall i\in \{1,\dots ,n\},$$ where the coefficients ${\sigma}_{j}^{i}\ge 0$ are chosen so that each ${\mathbf{\psi}}_{i}$ remains nonnegative. Once we have run Enlarge_Cone, the approximation of any function $\mathbf{\beta}$ is then sought as $${\mathbf{\beta}}^{\left(EC\right)}=\underset{{c}_{1},\dots ,{c}_{n}\ge 0}{\arg \min}{\parallel \mathbf{\beta}-\sum _{j=1}^{n}{c}_{j}{\mathbf{\psi}}_{j}\parallel}_{{L}^{2}\left([0,T+\tau ]\right)}^{2}.$$ We emphasize that the routine is valid for any set of nonnegative input functions. We can thus apply Enlarge_Cone to the functions ${\{{\mathbf{b}}_{i}\ge 0\}}_{i=1}^{n}$ from NMF but also to the functions selected by a greedy algorithm such as the following:
- For $n=1$, find $${\mathbf{b}}_{1}=\underset{\mathbf{\beta}\in {\left\{{\mathbf{\beta}}_{i}\right\}}_{i=1}^{K}}{\arg \max}{\parallel \mathbf{\beta}\parallel}_{{L}^{2}\left([0,T+\tau ]\right)}^{2}$$
- At step $n>1$, we have selected the set of functions $\{{\mathbf{b}}_{1},\dots ,{\mathbf{b}}_{n-1}\}$. We next find $${\mathbf{b}}_{n}=\underset{\mathbf{\beta}\in {\left\{{\mathbf{\beta}}_{i}\right\}}_{i=1}^{K}}{\arg \max}\;\underset{{c}_{1},\dots ,{c}_{n-1}\ge 0}{\min}{\parallel \mathbf{\beta}-\sum _{j=1}^{n-1}{c}_{j}{\mathbf{b}}_{j}\parallel}_{{L}^{2}\left([0,T+\tau ]\right)}^{2}$$
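
The SVD construction in the first item of the list above can be sketched as follows, assuming NumPy is available and that the training functions are sampled on a common time grid. The function name `svd_basis` and the synthetic training family are illustrative assumptions. The example also exhibits the positivity issue discussed above: with strictly positive training functions, the second mode necessarily changes sign.

```python
import numpy as np


def svd_basis(snapshots, n):
    """Return n modes b_i = sum_j v_{j,i} beta_j and all eigenvalues (descending)."""
    m = np.asarray(snapshots, dtype=float).T     # shape (Q + 1, K), one column per beta_j
    corr = m.T @ m                               # K x K correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    modes = m @ eigvecs[:, :n]                   # columns are b_1, ..., b_n
    return modes, eigvals


# Synthetic training set: K = 20 positive functions a + c * exp(-t / 10),
# which span a two-dimensional space (hypothetical, for illustration).
rng = np.random.default_rng(0)
t = np.arange(45)
snapshots = [a + c * np.exp(-t / 10.0)
             for a, c in zip(rng.uniform(0.05, 0.1, 20), rng.uniform(0.1, 0.4, 20))]

modes, eigvals = svd_basis(snapshots, n=2)
print("relative size of third eigenvalue:", abs(eigvals[2]) / eigvals[0])
print("second mode changes sign:", bool((modes[:, 1] > 0).any() and (modes[:, 1] < 0).any()))
```

The fast eigenvalue decay indicates that two modes suffice for this toy family, while the sign change of the second mode shows why a plain SVD expansion can produce negative values when extrapolated.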

In our numerical tests, we call the Enlarged Nonnegative Greedy (ENG) method the routine involving the above greedy algorithm combined with our routine Enlarge_Cone.

**Algorithm 1** Enlarge_Cone$[\{{\mathbf{b}}_{1},\dots ,{\mathbf{b}}_{n}\},\epsilon ]\to \{{\mathbf{\psi}}_{1},\dots ,{\mathbf{\psi}}_{n}\}$.

**Input:** Set of nonnegative functions $\{{\mathbf{b}}_{1},\dots ,{\mathbf{b}}_{n}\}$. Tolerance $\epsilon >0$.

**for** $i\in \{1,\dots ,n\}$ **do**

Set ${\sigma}_{j}^{i}=0,\quad \forall j\ne i.$

**for** $\ell \in \{1,\dots ,n\}$, $\ell \ne i$ **do**

${\alpha}_{\ell}^{i,*}=\arg \max \{\alpha \ge 0\;:\;\left[{\mathbf{b}}_{i}-{\sum}_{j\ne i}{\sigma}_{j}^{i}{\mathbf{b}}_{j}-\alpha {\mathbf{b}}_{\ell}\right]\left(t\right)\ge 0,\quad \forall t\in [0,T+\tau ]\}$

${\sigma}_{\ell}^{i}={\sigma}_{\ell}^{i}+\frac{{\alpha}_{\ell}^{i,*}}{2}$

**while** ${\alpha}_{\ell}^{i,*}\ge \epsilon $ **do**

${\alpha}_{\ell}^{i,*}=\arg \max \{\alpha \ge 0\;:\;\left[{\mathbf{b}}_{i}-{\sum}_{j\ne i}{\sigma}_{j}^{i}{\mathbf{b}}_{j}-\alpha {\mathbf{b}}_{\ell}\right]\left(t\right)\ge 0,\quad \forall t\in [0,T+\tau ]\}$

${\sigma}_{\ell}^{i}={\sigma}_{\ell}^{i}+\frac{{\alpha}_{\ell}^{i,*}}{2}$

**end while**

**end for**

${\mathbf{\psi}}_{i}={\mathbf{b}}_{i}-{\sum}_{j\ne i}{\sigma}_{j}^{i}{\mathbf{b}}_{j}$

**end for**

**Output:** $\{{\mathbf{\psi}}_{1},\dots ,{\mathbf{\psi}}_{n}\}$

We remark that, if we work with positive functions that are bounded above by a constant $L>0$, we can ensure that the approximations, denoted by $\Psi $ and written as linear combinations of basis functions, also remain between the bounds 0 and L. To do so, we define, on the one hand, the cone of positive functions generated by the above family ${\left\{{\psi}_{i}\right\}}_{i}$, as we have just done, and, on the other hand, we consider the family of functions $L-\phi $, where $\phi $ ranges over the greedy elements of the reduced basis, to which we also apply the enlargement procedure. We then require the approximation to be written as a positive combination of the first (positive) functions and $L-\mathsf{\Psi}$ to be written as a combination with positive components in the second basis. In this framework, the approximation takes the form of a least-squares approximation with $2n$ linear constraints on the n coefficients, expressing the fact that the coefficients are nonnegative in the two transformed bases.
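
A compact sketch of the Enlarge_Cone routine for functions sampled on a finite time grid is given below. On a grid, the maximal admissible $\alpha$ has the closed form $\min_t \mathrm{residual}(t)/\mathbf{b}_\ell(t)$ over the samples where $\mathbf{b}_\ell(t)>0$. The helper names, the handling of an identically zero $\mathbf{b}_\ell$, and the toy input are assumptions made for this illustration.

```python
def max_alpha(residual, b_l):
    """Largest alpha >= 0 with residual(t) - alpha * b_l(t) >= 0 at every sample."""
    ratios = [r / b for r, b in zip(residual, b_l) if b > 0.0]
    return max(0.0, min(ratios)) if ratios else 0.0


def enlarge_cone(basis, eps=1e-6):
    """Return psi_i = b_i - sum_{j != i} sigma_j^i b_j, each kept nonnegative."""
    psis = []
    for i, b_i in enumerate(basis):
        psi = list(b_i)                  # current b_i - sum_j sigma_j^i b_j
        for l, b_l in enumerate(basis):
            if l == i:
                continue                 # psi_i only subtracts the other functions (j != i)
            while True:
                alpha = max_alpha(psi, b_l)
                # sigma_l^i += alpha / 2, applied directly to the running residual
                psi = [p - 0.5 * alpha * b for p, b in zip(psi, b_l)]
                if alpha < eps:
                    break
        psis.append(psi)
    return psis


# Toy nonnegative inputs sampled at three time points (hypothetical).
b1 = [1.0, 2.0, 3.0]
b2 = [1.0, 1.0, 1.0]
psis = enlarge_cone([b1, b2])
print([round(v, 4) for v in psis[0]])   # close to b1 - 1 * b2
print([round(v, 4) for v in psis[1]])   # close to b2 - (1/3) * b1
```

Each pass halves the remaining admissible step, so the inner loop stops after roughly $\log_2(\alpha_0/\epsilon)$ iterations, where $\alpha_0$ is the first admissible step, and the returned functions touch zero at (at least) one sample while staying nonnegative.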

**Remark 1.**

**Reduced models on $\mathcal{I}=\{\mathit{I}(\mu )\,:\,\mu \in \mathcal{P}\}$ and $\mathcal{R}=\{\mathit{R}(\mu )\,:\,\mu \in \mathcal{P}\}$.** Instead of applying model reduction to the sets $\mathcal{B}$ and $\mathcal{G}$, as we do in our approach, we could apply the above techniques directly to the sets of solutions $\mathcal{I}$ and $\mathcal{R}$ of the SIR models with time-dependent coefficients in $\mathcal{B}$ and $\mathcal{G}$. In this case, however, the resulting approximation would not follow SIR dynamics.

## 3. Methodology for Multiple Regions Including Population Mobility Data

#### 3.1. Multi-Regional Compartmental Models

- In a Eulerian description, we take the regions as fixed references for which we record incoming and outgoing travels;
- In a Lagrangian description, we follow the motion of people living in a certain region and record their travels in the territory. We can expect this modeling to be more informative regarding the geographical spread of the disease, but it comes at the cost of additional details regarding the home region of the population.

#### 3.1.1. Multi-Regional SIR Models with Time-Dependent Parameters

- The model is consistent in the sense that it satisfies (9), and when $P=1$, we recover the traditional SIR model;
- Under lockdown measures, ${\lambda}_{i\to j}\approx {\delta}_{i,j}$ and the population ${N}_{i}\left(t\right)$ remains practically constant. As a result, the evolution of each region is decoupled from the others, and each region can be addressed with a mono-regional approach;
- The use of ${\beta}_{j}$ in Equation (11) is debatable. When people from region j arrive in region i, it may be reasonable to assume that the contact rate is ${\beta}_{i}$;
- The use of ${\lambda}_{j\to i}$ in Equation (11) is also very debatable. The probability ${\lambda}_{j\to i}$ was originally defined to account for the mobility of people from region j to region i without specifying the compartment. However, in Equation (11), we need the probability of mobility of infectious people from region j to region i, which we denote by ${\mu}_{j\to i}$ in the following. It seems reasonable to think that ${\mu}_{j\to i}$ may be smaller than ${\lambda}_{j\to i}$, because as soon as people become symptomatic and suspect an illness, they will probably not move. Two possible options would be as follows:
  - We could try to estimate ${\mu}_{j\to i}$. If symptoms arise, for example, 2 days after infection and if people recover in 15 days on average, then we could say that ${\mu}_{j\to i}=\frac{2}{15}{\lambda}_{j\to i}$.
  - As the above seems to be quite empirical, another option would be to use ${\lambda}_{j\to i}$ and absorb the uncertainty in the values of the ${\beta}_{j}$ that can be fitted.

- We choose not to add mobility in the R compartment as this does not modify the dynamics of the epidemic spread; only adjustments in the population sizes are needed.

#### 3.1.2. Detailed Multi-Regional Models with Constant Coefficients

#### 3.2. Forecasting for Multiple Regions with Population Mobility

**Proposition 2.**

**Proof.**

## 4. Numerical Results

#### 4.1. Data

#### 4.2. Results

- routine-IR: loss function $\mathcal{J}(\mathbf{\beta},\mathbf{\gamma}\;|\;{\mathbf{I}}_{\mathrm{obs}},{\mathbf{R}}_{\mathrm{obs}},[0,T])$ from (1);
- routine-$\mathbf{\beta}\mathbf{\gamma}$: loss function $\tilde{\mathcal{J}}(\mathbf{\beta},\mathbf{\gamma}\,|\,{\mathbf{\beta}}_{\mathrm{obs}}^{*},{\mathbf{\gamma}}_{\mathrm{obs}}^{*},[0,T])$ from (6).

#### 4.2.1. Forecasting for the First Pandemic Wave with a 14 Day Horizon

- ${I}_{SVD}$, ${I}_{NMF}$, ${I}_{ENG}$, ${R}_{SVD}$, ${R}_{NMF}$, ${R}_{ENG}$ are the average forecasts obtained using routine-$\mathbf{\beta}\mathbf{\gamma}$.
- ${I}_{SVD}^{*}$, ${I}_{NMF}^{*}$, ${I}_{ENG}^{*}$, ${R}_{SVD}^{*}$, ${R}_{NMF}^{*}$, ${R}_{ENG}^{*}$ are the average forecasts obtained using routine-IR.

#### 4.2.2. Focus on the Forecasting with ENG

#### 4.2.3. Forecasting of the Second Wave with ENG

#### 4.2.4. Forecasts with ENG with a 28 Day-Ahead Window

## 5. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Appendix A. Analysis of Model Error and Observation Noise

- Observation noise: The observed health data ${\mathbf{I}}_{\mathrm{obs}}$ and ${\mathbf{R}}_{\mathrm{obs}}$ present several sources of noise. There may be some inaccuracies in the reporting of cases, and the data are then post-processed. This eventually yields the noisy time series ${\mathbf{I}}_{\mathrm{obs}}$ and ${\mathbf{R}}_{\mathrm{obs}}$ for which it is difficult to estimate the uncertainty. These noisy data are then used to produce (through finite differences) ${\beta}_{\mathrm{obs}}^{*}$ and ${\gamma}_{\mathrm{obs}}^{*}$, which are in turn also noisy.
- Model errors: Two types of model errors are identified:
  - (a) Intrinsic model error on $\mathcal{B}$ and $\mathcal{G}$: The families of detailed models that we use are rich, but they may not cover all possible evolutions of ${\mathbf{I}}_{\mathrm{obs}}$ and ${\mathbf{R}}_{\mathrm{obs}}$. In other words, our manifolds $\mathcal{B}$ and $\mathcal{G}$ may not perfectly cover the real epidemiological evolution. This error motivated the introduction of the exponential functions ${b}_{0}$ and ${g}_{0}$ described in Section 2.4.
  - (b) Sampling error of $\mathcal{B}$ and $\mathcal{G}$: The size of the training sets ${\mathcal{B}}_{\mathrm{tr}}$ and ${\mathcal{G}}_{\mathrm{tr}}$ and the dimension n of the reduced models ${\mathrm{B}}_{n}$ and ${\mathrm{G}}_{n}$ also limit how well the continuous target sets $\mathcal{B}$ and $\mathcal{G}$ can be approximated.
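
To illustrate how observation noise carries over to ${\beta}_{\mathrm{obs}}^{*}$ and ${\gamma}_{\mathrm{obs}}^{*}$, the sketch below estimates the two rates by forward finite differences, using the standard SIR relations $R'=\gamma I$ and $S'=-\beta SI/N$ with $S=N-I-R$. The function name, the synthetic series, and the size of the perturbation are assumptions chosen for illustration; the exact finite-difference formula used by the authors is not reproduced here.

```python
def estimate_rates(i_obs, r_obs, n, dt=1.0):
    """Forward-difference estimates of beta(t_k) and gamma(t_k); requires I, S > 0."""
    betas, gammas = [], []
    for k in range(len(i_obs) - 1):
        s_k = n - i_obs[k] - r_obs[k]
        s_next = n - i_obs[k + 1] - r_obs[k + 1]
        gammas.append((r_obs[k + 1] - r_obs[k]) / (dt * i_obs[k]))
        betas.append(-n * (s_next - s_k) / (dt * s_k * i_obs[k]))
    return betas, gammas


# Synthetic noise-free series generated by one-day Euler steps with known rates.
N, BETA, GAMMA = 1_000_000.0, 0.3, 0.1
s, i, r = N - 100.0, 100.0, 0.0
i_obs, r_obs = [i], [r]
for _ in range(30):
    new_inf, new_rem = BETA * s * i / N, GAMMA * i
    s, i, r = s - new_inf, i + new_inf - new_rem, r + new_rem
    i_obs.append(i)
    r_obs.append(r)

betas, gammas = estimate_rates(i_obs, r_obs, N)

# A single 1% reporting error on R at day 10 distorts the two neighbouring estimates.
r_noisy = list(r_obs)
r_noisy[10] *= 1.01
_, gammas_noisy = estimate_rates(i_obs, r_noisy, N)
print("gamma around day 10 (clean):", gammas[9], gammas[10])
print("gamma around day 10 (noisy):", gammas_noisy[9], gammas_noisy[10])
```

On noise-free data the known rates are recovered, whereas a small error on one reported value shifts the estimates on both sides of it in opposite directions, which is the amplification effect described in the first item above.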

#### Appendix A.1. Study of the Impact of the Sampling Error: Fitting a Virtual Scenario

**Figure A2.** Study of sampling errors: Fitting errors of functions $\mathbf{\beta}$ and $\mathbf{\gamma}$ from $\mathcal{B}$ and $\mathcal{G}$ that belong to the training sets ${\mathcal{B}}_{\mathrm{tr}}$ and ${\mathcal{G}}_{\mathrm{tr}}$.

**Figure A3.** Study of sampling errors: Errors on $\mathbf{I}$ and $\mathbf{R}$ obtained by the susceptible–infected–removed (SIR) model using the fitted $\mathbf{\beta}$ and $\mathbf{\gamma}$.

**Figure A4.** Study of sampling errors: Fitting errors of functions $\mathbf{\beta}$ and $\mathbf{\gamma}$ from $\mathcal{B}$ and $\mathcal{G}$ but which do not belong to the training sets ${\mathcal{B}}_{\mathrm{tr}}$ and ${\mathcal{G}}_{\mathrm{tr}}$.

**Figure A5.** Study of sampling errors: Errors on $\mathbf{I}$ and $\mathbf{R}$ obtained by the SIR model using the fitted $\mathbf{\beta}$ and $\mathbf{\gamma}$.

#### Appendix A.2. Study of the Impact of Noisy Data and Intrinsic Model Error

## Appendix B. Study of Forecasting Errors

## References

1. Ceylan, Z. Estimation of COVID-19 prevalence in Italy, Spain, and France. Sci. Total. Environ. **2020**, 729, 138817.
2. Anastassopoulou, C.; Russo, L.; Tsakris, A.; Siettos, C. Data-based analysis, modelling and forecasting of the COVID-19 outbreak. PLoS ONE **2020**, 15, e0230405.
3. Fang, X.; Liu, W.; Ai, J.; He, M.; Wu, Y.; Shi, Y.; Shen, W.; Bao, C. Forecasting incidence of infectious diarrhea using random forest in Jiangsu Province, China. BMC Infect. Dis. **2020**, 20, 1–8.
4. Roda, W.C.; Varughese, M.B.; Han, D.; Li, M.Y. Why is it difficult to accurately predict the COVID-19 epidemic? Infect. Dis. Model. **2020**, 5, 271–281.
5. Liu, Z.; Magal, P.; Seydi, O.; Webb, G. Predicting the cumulative number of cases for the COVID-19 epidemic in China from early data. arXiv **2020**, arXiv:2002.12298.
6. Keeling, M.J.; Rohani, P. Modeling Infectious Diseases in Humans and Animals; Princeton University Press: Princeton, NJ, USA, 2011.
7. Martcheva, M. An Introduction to Mathematical Epidemiology; Springer: Berlin/Heidelberg, Germany, 2015; Volume 61.
8. Brauer, F. Mathematical epidemiology: Past, present, and future. Infect. Dis. Model. **2017**, 2, 113–127.
9. Magal, P.; Webb, G. Predicting the number of reported and unreported cases for the COVID-19 epidemic in South Korea, Italy, France and Germany. Medrxiv **2020**, 509, 110501.
10. Di Domenico, L.; Pullano, G.; Sabbatini, C.E.; Boëlle, P.Y.; Colizza, V. Impact of lockdown in Île-de-France and possible exit strategies. BMC Med. **2020**, 18, 240.
11. Flaxman, S.; Mishra, S.; Gandy, A.; Unwin, H.J.T.; Mellan, T.A.; Coupland, H.; Whittaker, C.; Zhu, H.; Berah, T.; Eaton, J.W.; et al. Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe. Nature **2020**, 584, 257–261.
12. Kermack, W.O.; McKendrick, A.G.; Walker, G.T. A contribution to the mathematical theory of epidemics. Proc. R. Soc. Lond. Ser. Contain. Pap. Math. Phys. Character **1927**, 115, 700–721.
13. Armocida, B.; Formenti, B.; Ussai, S.; Palestra, F.; Missoni, E. The Italian health system and the COVID-19 challenge. Lancet Public Health **2020**, 5, e253.
14. Roques, L.; Klein, E.K.; Papaïx, J.; Sar, A.; Soubeyrand, S. Impact of Lockdown on the Epidemic Dynamics of COVID-19 in France. Front. Med. **2020**, 7, 274.
15. Ferguson, N.; Laydon, D.; Nedjati Gilani, G.; Imai, N.; Ainslie, K.; Baguelin, M.; Bhatia, S.; Boonyasiri, A.; Cucunuba Perez, Z.; Cuomo-Dannenburg, G.; et al. Report 9: Impact of Non-Pharmaceutical Interventions (NPIs) to Reduce COVID19 Mortality and Healthcare Demand. Imp. Coll. Lond. **2020**, 10, 77482.
16. Liu, Z.; Magal, P.; Seydi, O.; Webb, G. A COVID-19 epidemic model with latency period. Infect. Dis. Model. **2020**, 5, 323–337.
17. Poncela, P.; Rodríguez, J.; Sánchez-Mangas, R.; Senra, E. Forecast combination through dimension reduction techniques. Int. J. Forecast. **2011**, 27, 224–237.
18. Binev, P.; Cohen, A.; Dahmen, W.; DeVore, R.; Petrova, G.; Wojtaszczyk, P. Convergence Rates for Greedy Algorithms in Reduced Basis Methods. SIAM J. Math. Anal. **2011**, 43, 1457–1472.
19. DeVore, R.; Petrova, G.; Wojtaszczyk, P. Greedy algorithms for reduced bases in Banach spaces. Constr. Approx. **2013**, 37, 455–466.
20. Cohen, A.; DeVore, R. Kolmogorov widths under holomorphic mappings. IMA J. Numer. Anal. **2016**, 36, 1–12.
21. Maday, Y.; Mula, O.; Turinici, G. Convergence analysis of the Generalized Empirical Interpolation Method. SIAM J. Numer. Anal. **2016**, 54, 1713–1731.
22. Quarteroni, A.; Manzoni, A.; Negri, F. Reduced Basis Methods for Partial Differential Equations: An Introduction; Springer: Berlin/Heidelberg, Germany, 2015; Volume 92.
23. Benner, P.; Cohen, A.; Ohlberger, M.; Willcox, K. Model Reduction and Approximation: Theory and Algorithms; SIAM: Philadelphia, PA, USA, 2017; Volume 15.
24. Hesthaven, J.S.; Rozza, G.; Stamm, B. Certified Reduced Basis Methods for Parametrized Partial Differential Equations; Springer: Berlin/Heidelberg, Germany, 2016; Volume 590.
25. Maday, Y.; Patera, A. Reduced basis methods. In Model Order Reduction; Benner, P., Grivet-Talocia, S., Quarteroni, A., Rozza, G., Schilders, W., Silveira, L., Eds.; De Gruyter: Oxford, UK, 16 December 2020; Chapter 4; pp. 139–179.
26. Paatero, P.; Tapper, U. Positive matrix factorization: A non-negative factor model with optimal utilization of error estimates of data values. Environmetrics **1994**, 5, 111–126.
27. Gillis, N. The why and how of nonnegative matrix factorization. Regul. Optim. Kernels Support Vector Mach. **2014**, 12, 257–291.
28. Atif, J.; Cappé, O.; Kazakçi, A.; Léo, Y.; Massoulié, L.; Mula, O. Initiative Face au Virus Observations sur la Mobilité Pendant L’épidémie de COVID-19; Technical Report; PSL University: Paris, France, 2020.
29. Mizumoto, K.; Kagaya, K.; Zarebski, A.; Chowell, G. Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the Diamond Princess cruise ship, Yokohama, Japan, 2020. Eurosurveillance **2020**, 25, 2000180.

**Figure 4.** ${\beta}_{\mathrm{obs}}^{*}$, ${\gamma}_{\mathrm{obs}}^{*}$, ${R}_{0,\mathrm{obs}}^{*}={\beta}_{\mathrm{obs}}^{*}/{\gamma}_{\mathrm{obs}}^{*}$ deduced from the data from ${t}_{0}=19/03/2020$ to $T=20/05/2020$.

| Compartment | Description |
|---|---|
| S | Susceptible |
| E | Exposed (non infectious) |
| ${I}_{p}$ | Infected and pre-symptomatic (already infectious) |
| ${I}_{a}$ | Infected and asymptomatic (but infectious) |
| ${I}_{ps}$ | Infected and paucisymptomatic |
| ${I}_{ms}$ | Infected with mild symptoms |
| ${I}_{ss}$ | Infected with severe symptoms |
| H | Hospitalized |
| C | Intensive Care Unit |
| R | Removed |
| D | Dead |

| Parameter | Description |
|---|---|
| ${\beta}_{p}$ | Relative infectiousness of ${I}_{p}$ |
| ${\beta}_{a}$ | Relative infectiousness of ${I}_{a}$ |
| ${\beta}_{ps}$ | Relative infectiousness of ${I}_{ps}$ |
| ${\beta}_{ms}$ | Relative infectiousness of ${I}_{ms}$ |
| ${\beta}_{ss}$ | Relative infectiousness of ${I}_{ss}$ |
| ${\beta}_{H}$ | Relative infectiousness of ${I}_{H}$ |
| ${\beta}_{C}$ | Relative infectiousness of ${I}_{C}$ |
| ${\epsilon}^{-1}$ | Latency period |
| ${\mu}_{p}^{-1}$ | Duration of prodromal phase |
| ${p}_{a}$ | Probability of being asymptomatic |
| ${\mu}^{-1}$ | Infectious period of ${I}_{a}$, ${I}_{ps}$, ${I}_{ms}$, ${I}_{ss}$ |
| ${p}_{ps}$ | If symptomatic, probability of being paucisymptomatic |
| ${p}_{ms}$ | If symptomatic, probability of developing mild symptoms |
| ${p}_{ss}$ | If symptomatic, probability of developing severe symptoms (note that ${p}_{ps}+{p}_{ms}+{p}_{ss}=1$) |
| ${p}_{C}$ | If severe symptoms, probability of going in C |
| ${\lambda}_{CR}$ | If in C, daily rate entering in R |
| ${\lambda}_{CD}$ | If in C, daily rate entering in D |
| ${\lambda}_{HR}$ | If hospitalized, daily rate entering in R |
| ${\lambda}_{HD}$ | If hospitalized, daily rate entering in D |

| Compartment | Description |
|---|---|
| S | Susceptible |
| ${E}_{1}$ | Exposed (non infectious) |
| ${E}_{2}$ | Infected and pre-symptomatic (already infectious) |
| I | Infected and symptomatic |
| U | Unnoticed |
| R | Dead and removed |

| Parameter | Description |
|---|---|
| $\beta $ | Relative infectiousness of I, U, ${E}_{2}$ |
| ${\delta}^{-1}$ | Latency period |
| ${\sigma}^{-1}$ | Duration of prodromal phase |
| $\nu $ | Proportion of I among $I+U$ |
| ${\gamma}_{1}$ | If I, daily rate entering in R |
| ${\gamma}_{2}$ | If U, daily rate entering in R |

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Bakhta, A.; Boiveau, T.; Maday, Y.; Mula, O.
Epidemiological Forecasting with Model Reduction of Compartmental Models. Application to the COVID-19 Pandemic. *Biology* **2021**, *10*, 22.
https://doi.org/10.3390/biology10010022
