Estimation of Daily Reproduction Numbers during the COVID-19 Outbreak

: (1) Background: The estimation of daily reproduction numbers throughout the contagiousness period is rarely considered, and only their sum R 0 is calculated to quantify the contagiousness level of an infectious disease. (2) Methods: We provide the equation of the discrete dynamics of the epidemic’s growth and obtain an estimation of the daily reproduction numbers by using a deconvolution technique on a series of new COVID-19 cases. (3) Results: We provide both simulation results and estimations for several countries and waves of the COVID-19 outbreak. (4) Discussion: We discuss the role of noise on the stability of the epidemic’s dynamics. (5) Conclusions: We consider the possibility of improving the estimation of the distribution of daily reproduction numbers during the contagiousness period by taking into account the heterogeneity due to several host age classes.


Overview and Literature Review
Following the severe acute respiratory syndrome outbreak caused by coronavirus SARS CoV-1 in 2002 [1] and the Middle East Respiratory Syndrome outbreak caused by coronavirus MERS-CoV in 2012 [2], the COVID-19 disease caused by coronavirus SARS CoV-2 is the third coronavirus outbreak to occur in the past two decades. Human coronaviruses, including 229E, OC43, NL63 and HKU1, are a group of viruses that cause a significant percentage of all common colds in humans [3]. SARS CoV-2 can be transmitted from person to person by respiratory droplets and through contact and fomites. Therefore, the severity of disease symptoms, such as cough and sputum, and their viral load, are often the most important factors in the virus's ability to spread, and these factors can change rapidly within only a few days during an individual's period of contagiousness. This ability to spread is quantified by the basic reproduction number R 0 (also called the average reproductive rate), a classical epidemiologic parameter that describes the transmissibility of an infectious disease and is equal to the number of susceptible individuals that an infectious individual can transmit the disease to during his contagiousness period. For contagious diseases, the transmissibility is not a biological constant: it is affected by numerous factors, including endogenous factors, such as the concentration of the virus in aerosols emitted by the patient (variable during his contagiousness period), and exogenous factors, such as geo-climatic, demographic, socio-behavioral and economic factors governing pathogen transmission (variable during the outbreak's history) [4][5][6][7][8].
Due to these exogenous factors, R 0 might change seasonally, but these factor variations are not significant if a very short period of time is considered. R 0 depends also on endogenous factors such as the viral load of the infectious individuals during their contagiousness period, and the variations in this viral load [9][10][11][12][13][14][15] must be considered in both theoretical and applied studies on the COVID-19 outbreak, in which the authors estimate a unique reproduction number R 0 linked to the Malthusian growth parameter of the exponential phase of the epidemic, during which R 0 is greater than 1 (Figure 1). The corresponding model has been examined in depth, because it is useful and important for various applications, but the distribution of the daily reproduction number R j at day j of an individual's contagiousness period is rarely considered within a stochastic framework [16][17][18][19][20]. Figure 1. Spread of an epidemic disease from the first infectious "patient zero" (in red), located at the center of its influence sphere comprising the successive generations of infected individuals, for the same value of the reproduction number R 0 = 3, with a deterministic dynamic (left) and a stochastic one (right), with standard deviation σ of the uniform distribution on an interval centered on R 0 and with a random variable time interval i between infectious generations (after [16]).
We therefore defined a partial reproduction number for each day of an individual's contagiousness period, and, assuming initially that this number was the same for all individuals, we obtained the evolution equation for the number of new daily cases in a population. Assuming that the distribution of partial reproduction numbers (referred to as daily for simplicity) was subject to fluctuations, we calculated the consequences for their estimation, and we estimated them for a large number of countries, taking a duration of contagiousness of 3 followed by 7 days.
When this distribution is considered, it is possible to calculate its entropy as a parameter quantifying its uniformity and to simulate the dynamics of the infectious disease either using a Markovian model such as that defined in Delbrück's approach [17] or a classical discrete or ODE SIR deterministic model. In the Markovian case, R 0 can be calculated from the evolutionary entropy defined by L. Demetrius as the Kolmogorov-Sinaï entropy of the corresponding random process [18], which measures the stability of the invariant measure, dividing the population into the subpopulations S (individuals susceptible to but not yet infected with the disease), I (infectious individuals) and R (individuals who have recovered from the disease and now have immunity to it). In the deterministic case, R 0 corresponds to the Malthusian parameter quantifying its exponential growth, and the stability of the asymptotic steady state depends on the subdominant eigenvalue [19,20].

Calculation of R 0
In epidemiology, there are essentially two broad ways to calculate R 0 , which correspond to the individual-level modeling and to the population-level modeling. At the individual level, if we suppose the susceptible population size constant (hypothesis valid during the exponential phase of an epidemic), the daily reproduction rates of an individual are typically non-constant over his contagiousness period, and the calculations we present in the following define a new method for estimating R 0 , as the sum of the daily reproduction rates. This new approach allows us to have a clearer view on the respective influence on the transmission rate by endogenous factors (depending on the level of immunologic defenses of an individual) and exogenous factors (depending on environmental conditions).

Materials and Methods
The methodology chosen starts from an attempt to reconstruct an epidemic dynamic from the knowledge of the number R ikj of people infected at day j by a given infectious individual i during the kth day of his period of contagiousness of length r. By summing up the number of new infectious individuals X j−k present on day j − k where started their contagiousness, we find that the number of new infected people on day j is equal to: We will assume in the following that R ikj is the same, equal to R k , for all individuals I and day j, then depends only on day k. Then, we have: The convolution Equation (2) is the basis of our modelling of the epidemic dynamics.

The Contagion Mechanism from a First Infectious Case Zero
Let us suppose that the secondary infected individuals are recruited from the centre of the sphere of influence of an infectious case zero and that the next infected individuals remain on a sphere centred on case 0, by just widening its radius on day 2. Therefore, the susceptible individuals C(j), which each infectious on day j − 1 can recruit, are on a part of the sphere of influence of case 0 reached at day j (rectangles on Figure 2).

Figure 2.
Spread of an epidemic disease from a first infectious case 0 (located at its influence sphere centre) progressively infecting its neighbours in some regions (rectangles) on successive spheres.

The Biphasic Pattern of the Virulence Curve of Coronaviruses
Mostly, the clinical course of patients with seasonal influenza shows a biphasic occurrence of symptoms with two distinct peaks. Patients have a classic influenza disease followed by an improvement period and a recurrence of the symptoms [11]. The influenza RNA virus shedding (the time during which a person might be contagious to another person) increases sharply one half to one day after infection, peaks on day 2 and persists for an average total duration of 4.5 days, between 3 and 6 days, which explains why we will choose in the following contagiousness duration these extreme values, i.e., either 3 or 6 days, depending on the positivity of the estimated daily reproduction numbers. It is common to consider this biphasic evolution of influenza clinically: after incubation of one day, there is a high fever (39-40 • C), then a drop in temperature before rising, hence the term "V" fever. The other symptoms, such as coughing, often also have this improvement on the second day of the flu attack: after a first feverish rise (39-39.5 • C), the temperature drops to 38 • C on the second day, then rises before disappearing on the 5th day, the fever being accompanied by respiratory signs (coughing, sneezing, clear rhinorrhea, etc.). By looking at the shape of virulence curves observed in coronavirus patients [12][13][14][15][16], we often see this biphasic pattern.

Relationships between Markovian and ODE SIR Approaches
In the following, we suppose that the susceptible population size remains constant, which constitutes a hypothesis valid during the exponential phase of epidemic waves. The Markovian stochastic and ODE deterministic approaches are linked by a common background consisting of the birth and death process approach used in the kinetics of molecular reactions by Delbrück [17], then in dynamical systems theory by numerous authors [18][19][20][21][22][23], namely in modelling of the epidemic spread in exponential growth. In the ODE approach, the Malthusian parameter is the dominant eigenvalue, and the equivalent in the Markovian approach is the Kolmogorov-Sinai entropy (called evolutionary entropy in [24][25][26]).

First Method for Obtaining the SIR Equation from a Deterministic Discrete Mechanism
Let us suppose the model is deterministic and denote by X j the number of new infected cases at day j (j ≥ 1), and R k (k = 1, . . . , r) the daily reproduction number at day k of the contagiousness period of length r for all infectious individuals. Then, we have obtained Equation (2) by supposing that the contagiousness behaviour is the same for all the infectious individuals: which says that the X j−k new infected at day j − k give R k X j−k new infected on day j, throughout a period of contagiousness of r days, the R k 's being possibly different or zero. For example, if r = 3, for the number X 5 of new cases at day 5, equation X 5 = R 1 X 4 + R 2 X 3 + R 3 X 2 means that new cases at day 4 have contributed to new cases at day 5 with the term R 1 X 4 , R 1 being the reproduction number at first day of contagiousness of new infected individuals at day 4.
In matrix form, we obtain: where X = (X j , . . . , X j−r−1 ) and R = (R 1 , . . . , R r ) are r-dimensional vectors and M is the following r-r matrix: It is easy to show that, if X 0 = 1 and r = 5 (estimated length of the contagiousness period for COVID-19 [12][13][14][15][16][17][18][19][20][21]), we obtain: The length r of the contagiousness period can be estimated from the ARIMA series of the stationary random variables Y j 's, equal to the X j 's without their trend, by considering the length of the interval on which the auto-correlation function remains more than a certain threshold, e.g., 0.1 [4]. For example, by assuming r = 3, if R 1 = a, R 2 = b and R 3 = c, we obtain: X 0 = 1, X 1 = a, X 2 = a 2 + b + c, X 3 = a 3 + 2ab, X 4 = a 4 + 3a 2 b + b 2 + 2ac, X 5 = a 5 + 4a 3 b + 3ab 2 + 3a 2 c + 2bc, X 6 = a 6 + 5a 4 b + 4a 3 c + 6a 2 b 2 + 6abc + b 3 + c 2 , X 7 = a 7 + 6a 5 b + 5a 4 c + 10a 3 b 2 + 12a 2 bc + 4ab 3 + 3b 2 c + 3ac 2 If R 1 and R 2 are equal, respectively, to a and b, and if a = b = R/2, c = 0, then, X 5 behaves like: X 5 = R 5 /32 + R 4 /4 + 3R 3 /8 If R = 2, {X j } i=1,∞ is the Fibonacci sequence, and more generally, for R > 0, the generalized Fibonacci sequence. Let us suppose now that b = c = 0 and a depends on the day j: a j = > C(j), where C(j) represents the number of susceptible individuals, which can be met by one contagious individual at day j. If infected individuals (supposed to all be contagious) at day j are denoted by I j , we have: Let us suppose, as in Section 2.1, that the first infectious individual 0 recruits from the centre of its sphere of influence secondary infected individuals remaining in this sphere, and that the susceptible individuals recruited by the I j infectious individuals present at day j are located on a part of the sphere of centered on the first infectious 0 obtained by widening its radius ( Figure 2). Then, we can consider that the function C(j) increases, then saturates due to the fact that an infectious individual can meet only a limited number of susceptible individuals as the sphere grows. We can propose for C(j) the functional form C(j) = S(j)/(c + S(j)), where S(j) is the number of susceptible individuals at day j. Then, we can write the following equation, taking into account the mortality rate µ: This discrete version of epidemic modeling is used much less than the classic continuous version, corresponding to the ODE SIR model, with which we will show a natural link. Indeed, the discrete Equation (9) is close to SIR Equation (10), if the value of c is greater than that of S:

Second Method for Obtaining the SIR Equation from a Stochastic Discrete Mechanism
Another way to derive the SIR equation is the probabilistic approach, which comes from the microscopic equation of molecular shocks by Delbrück [17] and corresponds to a classical birth-and-death process: if at least one event (with rates of contact ν, birth f, death µ or recovering ρ) occurs in the interval (t, t + dt), and by supposing that births compensate deaths, leaving constant the total size N of the population, we have: Hence, we have, if P k (t) denotes Probability({S(t) = k, I(t) = N − k}): and we obtain: Then, by multiplying by s k and summing over k, we obtain the characteristic function of the random variable S. If births do not compensate deaths, we have: Probability ({S(t + dt) = k, I(t + dt) = j}) = P(S(t) = k, I(t) = j) ( (12) If S and I are supposed to be independent and if the coefficients ν, f, µ and ρ are sufficiently small, S and I are Poisson random variables [27], whose expectations E(S) and E(I) verify: leading to the SIR equation for the variables S, I and R considered as deterministic: If R 0 denotes the basic reproduction number (or average transmission rate) in a givenpopulation, we can estimate the distribution V (whose coefficients are denoted V j = R j /R o ) of the daily reproduction numbers R j along the contagious period of an individual, by remarking that the number X j of new infectious cases at day j is equal to X j = I j − I j−1 , where I j is the cumulated number of infectious at day j, and verifies the convolution equation (equivalent to Equation (2)): where r is the duration of the contagion period, estimated by 1/(ρ + µ), ρ being the recovering rate and µ the death rate in SIR Equation (14). r and S can be considered as constant during the exponential phases of the pandemic, and we can assume that the distribution V is also constant; then, V can be estimated by solving the linear system (equivalent to Equation (3)): where M is given by Equation (4). Equation (16) can be solved numerically, if the pandemic is observed during a time greater than 1/(ρ + µ). We will first demonstrate an example of how the matrix M can be repeatedly calculated for consecutive periods of length equal to that of the contagiousness period (supposed to be constant during the outbreak), giving matrix series M 1 , M 2 , . . . Following Equation (4), we put the values of X i 's in the two matrices below, with r = 3 for two periods, the first from day 1 to day 3 and the second from day 4 to day 6.
where, after Equation (6), M 1 and M 2 can be calculated from the R j 's as: and M 2 is given by: Additionally, from Equation (2), if, for instance, j = 8 and r = 3, then we have the expression below, which means that the new cases on the 8th day depend on the new cases detected on the previous days 7, 6 and 5, supposed to be in a period of contagiousness of 3 days: Let us suppose now that the initial R j 's on a contagiousness period of 3 days, are equal to: gives the R j 's from Equation (16), hence allows the calculation of X j = Σ k=1,3 R k X j−k .
The inverse of M is denoted by M −1 and verifies: R = M −1 X, where X = (X 6 , X 5 , X 4 ), with X 1 = 1, X 2 = 2, X 3 = 5, X 4 = 14, X 5 = 37, X 6 = 98 and we obtain: and a deconvolution gives the resulting R j 's:  , thanks to the following calculation: We obtain for the resulting distribution of daily reproduction numbers the exact replica of the initial distribution. We obtain the same result by replacing M 1 by the matrix M 2 .

Distribution of the Daily Reproduction Numbers R j 's When They Are Supposed to Be Random
Let us consider a stochastic version of the deterministic toy model corresponding to Equation (17), by introducing an increasing noise on the R j 's, e.g., by randomly choosing their values following a uniform distribution on the three intervals: [2 − a, 2 + a], [1 − a/2, 1 + a/2] and [2 − a, 2 + a] (for having a U-shape behavior), with increasing values of a, from 0.1 to 1, in order to see when the deconvolution would give negative resulting R j 's, with conservation of the average of their sum R 0 , if the random choice of the values of the R j 's at each generation is repeated, following the stochastic version of Equation (2): X j = Σ k=1,r (R k + ε k ) X j−k , where r is the contagiousness period duration and ε k is a noise perturbing R k , whose distribution is chosen uniform on the interval [0, 2a] for k = 1,3, and [0, a] for k = 2. This choice is arbitrary, and the main reason of the randomization is to show that the deconvolution can give negative results for R k 's, as those observed for increasing values of a, from 0.1 to 1, with explicit calculations for three consecutive periods, from day 1 to day 3, from day 4 to day 6, and from day 7 to day 9.
For each random choice of the values of the daily reproduction numbers R j 's, we can calculate a matrix M 1 corresponding to Equation (3). Its inversion into the matrix M 1 −1 makes it possible to solve the problem of deconvolution of Equation (2)-that is to say, to obtain new R j 's as a function of the observed X k 's. We can then calculate a new matrix M 2 from these new Rj's and thus continue during an epidemic the estimation of the daily reproduction numbers R j 's from the successive matrices M 1 , M 2 , . . . , and observed X k 's.

1.
For a = 0.1, let us randomly and uniformly choose the initial distribution of the daily reproduction numbers R 1 in the interval [   and its inverse is given by: New cases are: X 6 = 18.209, X 5 = 9.101, X 4 = 4.81, X 3 = 2.355, X 2 = 1, X 1 = 1, and by deconvoluting, we obtain the resulting R j 's equal to: R 1 = 1, R 2 = 1.355, R 3 = 1.1, i.e., the exact initial distribution.
Let us now consider new initial R j 's: R 1 = 1, R 2 = 1, R 3 = 1. That gives a new matrix M 2 , with new X 7 and X 8 calculated from the new initial R j 's, by using the former values of X 6 , . . . , X 2 : X 7 = X 6 + X 5 + X 4  More precise simulation results are given in Table 1, which summarizes computations made for random choices of R j 's distributions, for a = 0.1 and a = 1 and until time 20. These simulations show a great sensitivity to noise, but a qualitative conservation of their U-shaped distribution along the contagiousness period of individuals. More precisely, because of the presence of noise on the Rj's, we cannot always obtain positive values from the data for the Rj's by applying the deconvolution, which explains the presence of negative values in empirical examples, as in the theoretical noised examples. A way to solve this problem could be to suppose that noise is stationary during all of the growth period of a wave, then calculate the Rj's for all running time windows of length equal to the contagiousness duration and then obtain the mean of the Rj's corresponding to these windows. As this stationary hypothesis is not widely accepted, we prefer to keep negative values and focus on the shape of the distribution of the Rj's.  Figure 3 gives the effective transmission rates R e calculated between 20-25 October 2020 just before the second lockdown in France [28,29]. As the second wave of the epidemic is still in its exponential phase, it is more convenient (i) to consider the distribution of the marginal daily reproduction numbers and (ii) to calculate its entropy and simulate the epidemic dynamics using a Markovian model [4]. By using the daily new infected cases given in [30], we can calculate, as in Section 3.1, the inverse matrix M −1 for the period from 20 to 25 October 2020 (exponential phase of the second wave), by choosing 3 days for the duration of contagiousness period and the following raw data for new infected cases: 20,468 for 20 October, then 26,676, 41,622, 42,032, 45,422 and 52,010 for 25 October. Then, for France between 15 February and 27 October 2020, we obtain the daily reproduction numbers given in Figure 3 with a U-shape as observed for influenza viruses.
We have: The effective reproduction number is equal to R 0 ≈ 1.174, a value close to that calculated directly (Figure 3 Figure 4. Top: estimation of the effective reproduction number R e 's for the 1 November and the 12 November 2020 (in green, with their 95% confidence interval) [28,29]. Bottom left: Daily new cases in Chile between 1 November and 12 November [30]. Bottom right: U-shape of the evolution of the daily R j 's along the infectious 6-day period of an individual.
Hence, after deconvolution, we obtain: The effective reproduction number is equal to R 0 ≈ 1.011, a value close to that calculated directly, with a maximal daily reproduction number the last day of the contagiousness period. Due to the negativity of R 1 , we cannot derive the distribution V and therefore calculate its entropy. As entropy is an indicator of non-uniformity, an alternative could be to calculate it by shifting values of Rj's upwards by the value of their minimum.
The quasi-endemic situation in Chile since the end of August, which corresponds to the increase of temperature and drought at this period of the year [4], gives a cyclicity of the new cases occurrence whose period equals the length of the contagiousness period of about 6 days, analogue to the cyclic phenomenon observed in simulated stochastic data of Section 3.2. with a similar U-shaped distribution of the R j 's.

Russia
By using the daily new infected cases given in [30], we can calculate M −1 for the period from 30 September to 5 October 2020 (exponential phase of the second wave), by choosing 3 days for the duration of the contagiousness period and the following raw data for new infected cases (   The effective reproduction number is equal to R 0 ≈ 1.073, a value close to that calculated directly, with a maximal daily reproduction number the first day of the contagiousness period. Due to the negativity of R 2 , we cannot derive the distribution V and therefore calculate its entropy. The period studied corresponds to a local slow increase of new infected cases at the start of the second wave in Russia, which looks like a staircase succession of slightly inclined 4-day plateaus followed by a step: at the beginning of October, in Russia, new tightened restrictions (but avoiding lockdown) appeared [31], which could explain the change of the value of the slope observed in the new daily cases [30].

Nigeria
By using the daily new infected cases given in [30], we can calculate M −1 for the period from 5 November to 10 November (endemic phase), by choosing 3 days for the duration of the contagiousness period and the following raw data for the new infected cases (    The effective reproduction number is equal to R 0 ≈ 1.129, value close to that calculated directly, with a maximal daily reproduction number the last day of the contagiousness period. The distribution V equals (0.143, 0.342, 0.515) and its entropy H is equal to: H = −Σ k=1,r V k Log(V k ) = 0.29 + 0.37 + 0.34 = 1.
In Appendix C, Table A1 gives the shape of the R j 's distribution for 194 countries.

Weekly Patterns in Daily Infected Cases
Daily new infected cases are highly affected by weekdays, such that case numbers are lowest at the start of the week and increase afterwards. This pattern is observed at the world level, as well as at the level of almost every single country or USA state. Hence, in order to estimate biologically meaningful reproduction numbers, clean of weekly patterns due to administrative constraints, analyses have to be restricted to specific periods shorter than a week, or at rare occasions when patterns escape the administrative constraints. This weekly phenomenon occurs during exponential increase as well as decrease phases of the pandemic and during endemic periods in numbers of daily cases ( Figure 6). In addition, the daily new infected case record is discontinuous for many countries/regions, which frequently publish, on Monday or Tuesday, a cumulative count for that day and the weekend days. For example, Sweden typically publishes only four numbers over one week, the one on Tuesday cumulating cases for Saturday, Sunday and the two first weekdays. Discontinuity in records further limits the availability of data enabling detailed analyses of daily reproduction numbers and can be considered as extreme weekday effects on new case records due to various administrative constraints.
We calculated Pearson correlation coefficients r between a running window of daily new case numbers of 20 consecutive days and a running window of identical duration with different intervals between the two running windows. These Pearson correlation coefficients r typically peak with a lag of seven days between the two running windows.
The mean of these correlations are for each day of the week from Tuesday (data making up for the weekend underestimation) to Monday: 0.571, 0.514 (0.081), 0.383 (0.00008), 0.347 (0.000003), 0.381 (0.000006), 0.468 (0.000444) and 0.558 (0.03916), with, in parentheses, the p-value of the one-tailed paired t-test showing that the correlation observed with running windows starting Tuesday are more than the others (see also supplementary material). This could reflect a biological phenomenon of seven infection days. However, examination of the frequency distributions of lags for r maxima reveals, besides the median lag at 7 days, local maxima for multiples of 7 (14, 21, 28, 35, etc.). About 50 percent of all local maxima in r involve lags that are multiples of seven (seven included).
This excludes a biological causation, except if data periodicity comes from an entrainment by the weekly "Zeitgeber" of census, near the duration of the contagiousness interval. We tried to control for weekdays using two methods, and combinations thereof. For the first method, we calculated z-scores for each weekday, considering the mean number of cases for each weekday, and subtracted that mean from the observed number for a day (Figure 7). This delta was then divided by the standard deviation of the number of cases for that weekday. The mean and standard variation are calculated across the whole period of study for each weekday.
The second method implies data smoothing using a running window of 5 consecutive days, where the mean number of new cases calculated across the five days is subtracted from the number of new cases observed for the third day. Hence, data for a given day are compared to a mean including two previous, and two later days (Figure 8).
We constructed two further datasets, where z-scores are applied in the first to data after smoothing from the second method and are applied in the second data after smoothing from the first method (not shown) (Figures 9 and 10).
These four datasets from daily new cases database [30] transformed according to different methods and combinations thereof designed to control for weekday were analysed using the running window method. Despite attempts at controlling for weekday effects, the median lag was always seven days across all four transformed datasets, and local maxima in lag distributions were multiples of seven. After data transformations, about 50 percent of all local maxima were lags that are multiples of seven, seven included.  Visual inspection of plots of these transformed data versus time for daily new infected cases from the whole world shows systematic local biases in daily new infected cases (after transformation) on Sundays and Mondays, for all four transformed datasets, with Sundays and/or Mondays as local minima and/or local maxima, according to which method or combination thereof was applied to the data. Hence, the methods we used failed to neutralize the weekly patterns in daily new cases due to administrative constraints. This issue highly limits the data available for detailed analyses of daily new cases aimed at estimating biologically relevant estimates of reproduction numbers at the level of short temporal scales.   [30] applied to z-scores from Figure 8, as a function of days since 26 February 2020 until 23 August 2020 + indicates Sundays, X indicates Mondays. Z-transformations are specific to each weekday. For specific day j, the mean number of confirmed new cases calculated for days j − 1, j − 2, j, j + 1, j + 2 is subtracted from the number for day j.
By smoothing on five consecutive days of raw data (confirmed world daily new infected cases [24]) and applying the z-transformation, we obtain a better result in Figure 11 than in Figure 10 in order to neutralize the weekly pattern. The reason is that the smoothing largely eliminates the counting defect during weekends due either to fewer hospital admissions and/or less systematic PCR tests or to a lack of staff at the end of the week to perform the counts.

Discussion
The duration of the contagiousness period, as well as the daily virulence, are not constant over time. Three main factors, which are not constant during a pandemic, can explain this: - In the virus transmitter, the transition between the mechanisms of innate (the first defense barrier) and adaptive (the second barrier) immunity may explain a transient decrease in the emission of the pathogenic agent during the phase of contagiousness [15], - In the environmental transmission channel, many geophysical factors that vary over time can influence the transmission of the virus (temperature, humidity, altitude, etc.) [4][5][6][7][8], - In the recipient of the virus, individual or public policies of prevention, protection, eviction or vaccination, which evolve according to the epidemic severity and the awareness of individuals and socio-political forces, can change the sensitivity of the susceptible individuals [32].
It is therefore very important to seek to estimate the average duration of the period of contagiousness of individuals and the variations, during this phase of contagiousness, of the associated daily reproduction numbers [33][34][35][36][37][38][39]. If the duration of the contagiousness phase is more than 3-5 days, for example ±7 days, the periodicity of seven days observed for the new daily cases could result of an entrainment of the dynamics of new cases driven by the social "Zeitgeber" represented by the counting of new cases, less precise during the weekend (probably underestimated in many countries not working at this time). That questions the deconvolution over 3 and 5 days, giving some negative R j . In a future work, we will compare our results with those obtained by deconvolutions on contagiousness durations between 3 and 12 days in order to obtain possibly more realistic values for the R j 's, and hence, have perhaps a double explanation for the 7 days periodicity, both sociological and biological. Before this future work, we have extended our study using a duration r = 3 of contagiousness to r = 7. The results are given in Appendix B: they show the same existence of identical variations of U-shape type but they specify the values of R j 's, more often positive and of more realistic magnitude, while keeping a sum approximately equal to R 0 .
Rhodes and Demetrius have pointed out the interest of the distribution of the daily reproduction numbers [24] with respect to the classical unique R 0 , even time-dependent [25]. In particular, they found that this distribution was generally not uniform, which we have confirmed here by showing many cases where we observe the biphasic form of the virulence already observed in respiratory viruses, such as influenza. The entropy of the distribution makes it possible to evaluate the intensity of its corresponding U-shape. This entropy is high if the daily reproduction numbers are uniform, and it is low if the contagiousness is concentrated over one or two days. If some Rj are negative, it is still possible to calculate this uniformity index, by shifting their distribution by a translation equal to the inverse of the negative minimum value.
We have neglected in the present study the natural birth and death rates by supposing them identical, but we could have taken into account the mortality due to the COVID-19. The discrete dynamics of new cases can be considered as Leslie dynamics governed by the matrix equation: where X j is the vector of the new cases living at day j and L is the Leslie matrix given by: . . , r, is the recovering probability between days j and j + 1. The dynamical stability for L 2 distance to the stationary infection age pyramid P = lim j X j /Σ i=j,j−r+1 X i is related to |λ − λ |, the modulus of the difference between the dominant and sub-dominant eigenvalues of L, namely λ = e R and λ , where R is the Malthusian growth rate and P is the left eigenvector of L corresponding to λ. The dynamical stability for the distance (or symmetrized divergence) of Kullback-Leibler to P considered as stationary distribution is related to the population entropy H [26][27][28][29][30][31][32], which is defined if l j = ∏ i=1,j−1 b i and p j = l j R j /λ j , as follows: H = −Σ j=1,r p j Log(p j )/Σ j=1,r j p j (18) The mathematical characterization by the population entropy defined in Equation (16) of the stochastic stability of the dynamics described by Equation (16) has its origin in the theory of large deviations [40][41][42]. This notion of stability pertains to the rate at which the system returns to its steady state after a random exogenous and/or endogenous perturbation and it could be useful to quantify further the variations of the distribution of the daily reproduction numbers observed for many countries [43][44][45][46][47][48][49][50][51][52][53].
In summary, the main limitations of the present study are: -The hypothesis of spatio-temporal stationarity of the daily reproduction numbers is no longer valid in the case of rapid geo-climatic changes, such as sudden temperature rises, which decrease the virulence of SARS CoV-2 [4], or mutations affecting its transmissibility. - The still approximate knowledge of the duration r of the period of contagiousness necessitates a more in-depth study at variable durations, by retaining the value of r, which makes all of the daily reproduction numbers positive. - The choice of uniform random fluctuations of the daily reproduction numbers is based on arguments of simplicity. A more precise study would undoubtedly lead to a unimodal law varying throughout the contagious period, the average of which following a Ushaped curve, of the type observed in the literature on a few real patients [10,[54][55][56][57][58].

Conclusions and Perspectives
Concerning contagious diseases, public health physicians are constantly faced with four challenges. The first concerns the estimation of the basic reproduction number R 0 . The systematic use of R 0 simplifies the decision-making process by policymakers, advised by public health authorities, but it is too much of a caricature to account for the biology behind the viral spread. We have observed in the COVID-19 outbreak that it was non-constant during an epidemic wave due to exogenous and endogenous factors influencing both the duration of the contagiousness period and the daily transmission rate during this phase [54][55][56]. Then, the first challenge concerns the estimation of the mean duration of the infectious period for infected patients. As for the transmission rate, realistic assumptions made it possible to obtain an upper limit to this duration [45], mainly due to the lack of viral load data in large patient cohorts (see Figure A1 in Appendix A from [57][58][59]), in order to better guide the individual quarantine measures decided by the authorities in charge of public health. This upper bound also makes it possible to obtain a lower bound for the percentage of unreported infected patients, which gives an idea of the quality of the census of cases of infected patients, which is the second challenge facing specialists of contagious diseases. The third challenge is the estimation of the daily reproduction number over the contagiousness period, which was precisely the topic of the present paper. A fourth interesting challenge for this community is the extension of the methods developed in the present paper to the contagious non-infectious diseases (i.e., without causal infectious agent), such as social contagious diseases [59][60][61], the best example being that of the pandemic linked to obesity, for which many concepts and modelling methods remain available.
Eventually, our approach using marginal daily reproduction numbers involving a certain level of noise in the dynamics of new daily infected cases defines a stochastic framework which describes phenomenologically the exponential phase as our results show for countries such as France, Russia, Sweden, etc. This stochastic modelling allows a better understanding of the role of the contagiousness period length and of the heterogeneity (e.g., the U-shape) of its daily reproduction number distribution in the COVID-19 outbreak dynamics [62][63][64][65]. On the medical level, the important message about the U-shape is that COVID-19 is similar to other viral diseases, such as influenza, with two successive reactions from the two immune defense barriers, innate cellular immunity first, which is not sufficient if symptoms persist, then adaptive immunity (cellular and humoral), which results in a transient decrease in contagiousness between the two phases. The medical recommendations are, in this case, never to take a transient improvement for a permanent disappearance of the symptoms. One could indeed, for a public health use, be satisfied after estimating the sum of the R j 's, that is to say, R 0 or the effective R e . For an individual health use, it is important to know the existence of a minimum of the R j 's, which generally corresponds to a temporary clinical improvement, after the partial success of the innate immune defenses. This makes it possible to prevent the patient from continuing to respect absolute isolation and therapeutic measures, even if a transient improvement occurs; otherwise, they risk, as in the flu, a bacterial pulmonary superinfection (a frequent cause of death in the case of COVID-19). On the theoretical level, the interest of the proposed method is its generic character: it can be applied to all contagious diseases, within the very general framework of Equation (1), which makes no assumption about the spatial heterogeneity or the longitudinal constancy of the daily reproduction numbers. The deconvolution of Equation (1)

Acknowledgments:
The authors hereby give their thanks to the framework of the University of Excellence Concept "Research University in Helmholtz Association I Living the Change".

Conflicts of Interest:
The authors declare no conflict of interest. Figure A1 shows a U-shaped evolution for the viral load in real [57] and in simulated [58] COVID-19 patients, and in real influenza-infected animals for the viral load and the body temperature [59].     and we can represent the evolution of X j 's on Figure A2. The evolution of the Xj's along the period of contagiousness shows at day 4 a sharp increase and a saturation.

2.
Exponential phase in France from 25 October 2020 to 7 November 2020 The numbers of new cases are: The Figure A3 shows an evolution of the Xj's with a U-shape on the three first days along the period of contagiousness with a sum of R j 's equal to 1.11, close to the effective reproduction number R e = 1.13 [28].

3.
Beginning of the pandemic in the USA from 21 February 2020 to 5 March 2020 The evolution of the Xj's shows in Figure A4 a U-shape on day 4 with a sum of R j 's equal to 2.72, less than the effective reproduction number R e = 3.27 [28].

4.
USA exponential phase from 1 November 2020 to 4 November 2020 The evolution of the Xj's shows in Figure A5 a U-shape on the four last days with a sum of R j 's equal to 1.35, close to the effective reproduction number R e = 1.24 [28]. Figure A5. Values of the daily reproduction numbers R j along the period of contagiousness of length 7 days.

5.
Beginning of the pandemic in the UK from 23 February 2020 to 7 March 2020  Figure A6 shows an evolution of the Xj's with a U-shape on the three last days along the period of contagiousness with a sum of R j 's equal to 9.88, higher than the effective reproduction number R e = 2.95 [28].   Figure A7 shows an evolution of the Xj's with a U-shape on the five last days along the period of contagiousness with a sum of R j 's equal to 1.07, close to the effective reproduction number R e = 1.06 [28].  Table A1 is built from new COVID-19 cases at the start of the first and second waves for 194 countries; it shows 42 among these 194 countries having a U-shape evolution of their daily R j 's twice, for 12.12 ± 6 expected with 0.95 confidence (p < 10 −12 ), and 189 times, a U-shape evolution for all countries and waves (397), for 99.3 ± 9 expected with 0.95 confidence (p < 10 −24 ). Hence, the U-shape is the most frequent evolution of daily R j 's, which confirms the comparison with the behavior of seasonal influenza (see Section 2.2).